I'm used to doing this:
from sqlalchemy.orm import sessionmaker
from sqlalchemy.engine import create_engine
Session = sessionmaker()
engine = create_engine("some connection db string", echo=False)
Session.configure(bind=engine)
db_con = Session()
try:
    ...  # DB MANIPULATION
finally:
    db_con.close()
Is this a good habit? If so, why doesn't SQLAlchemy simply let you do:
with Session() as db_con:
    ...  # DB MANIPULATION
?
No, this isn't good practice. It's easy to forget to close the session, and it makes the code more confusing.
Instead, you can use the contextlib.closing context manager, and make that the only way to get a session.
# Wrapped in a custom context manager for better readability
import contextlib

@contextlib.contextmanager
def get_session():
    with contextlib.closing(Session()) as session:
        yield session
with get_session() as session:
    session.add(...)
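As an aside: the pattern the question asks for does exist in newer releases. Since SQLAlchemy 1.4, Session objects and sessionmaker work as context managers out of the box:
# Requires SQLAlchemy 1.4+; Session is the sessionmaker from the question.
with Session() as session:
    session.add(some_object)  # some_object is a placeholder
    session.commit()

# Or, committing automatically on success and rolling back on error:
with Session.begin() as session:
    session.add(some_object)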
First, when you are done with a session object you should close it. session.close() returns the connection to the engine's pool, and if you are exiting the program you should also dispose of the engine's pool with engine.dispose().
Now to your question. In most cases sessions are used in long-running applications like web servers, where it makes sense to centralize session management. For example, in Flask-SQLAlchemy a session is created at the start of each web request and closed when the request is over.
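A minimal sketch of that shutdown sequence, using the Session and engine from the question:
db_con = Session()
try:
    ...  # DB MANIPULATION
finally:
    db_con.close()  # returns the connection to the engine's pool

engine.dispose()  # at program exit: closes the pool's remaining connections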
Related
I have a route in my Flask app that spawns a process (using multiprocessing.Process) to do some background work. That process needs to be able to write to the database.
__init__.py:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from project.config import Config

db = SQLAlchemy()

# app factory
def create_app(config_class=Config):
    app = Flask(__name__)
    app.config.from_object(config_class)
    db.init_app(app)
    return app
And this is the relevant code that illustrates the way I'm spawning the process and using the db connection:
from multiprocessing import Process
from flask import redirect

def worker(row_id):
    db_session = db.create_scoped_session()
    # Do stuff with db_session here
    db_session.close()

@app.route('/worker/<row_id>/start')
def start(row_id):
    p = Process(target=worker, args=(row_id,))
    p.start()
    return redirect('/')
The problem is that sometimes (not always) I get errors like this one:
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) insufficient data in "D" message lost synchronization with server: got message type "a", length 1668573551
I assume this is related to the fact that there is another process accessing the database (because if I don't use a separate process, everything works fine), but I honestly can't find a way of fixing it. As you can see in my code, I tried using the create_scoped_session() method as an attempt to fix this, but the problem is the same.
Any help?
OK so, I followed @antont's hint and created a new SQLAlchemy session inside the worker function this way, and it worked flawlessly:
import os

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

def worker(row_id):
    db_url = os.environ['DATABASE_URL']
    db_engine = create_engine(db_url)
    Session = sessionmaker(bind=db_engine)
    db_session = Session()
    # Do stuff with db_session here
    db_session.close()
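Creating the engine inside the child process works because pooled connections don't survive a fork cleanly. An alternative sketch, assuming a module-level engine and SQLAlchemy 1.4.33+ (which added dispose(close=False)), is to discard the inherited pool in the child before first use:
import os

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

db_engine = create_engine(os.environ['DATABASE_URL'])  # created once, in the parent
Session = sessionmaker(bind=db_engine)

def worker(row_id):
    # Drop the connections inherited from the parent's pool without closing
    # their sockets (the parent still uses them); the child then checks out
    # fresh connections of its own.
    db_engine.dispose(close=False)
    db_session = Session()
    # Do stuff with db_session here
    db_session.close()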
I am running into a problem where my Dash/Flask web app uses too many MySQL resources when used for a longer time. Eventually the server becomes incredibly slow because it tries to keep too many database connections alive. The project started based on this article and is still organised in a similar way: https://hackersandslackers.com/plotly-dash-with-flask/
Once I open a URL from the website, each Dash callback seems to open its own connection to the database. Apart from the callbacks, Flask opens a database connection as well to store the user session. The number of open connections at the same time isn't really the problem; the fact that the connections aren't closed once finished is.
I've tried different settings and ways to set up the database connection, but none of them solved the problem of open database connections after the request is finished. Eventually the database runs out of resources because it tries to keep too many connections open, and the web app becomes unusable.
I've tried:
db.py
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import NullPool
from sqlalchemy.ext.declarative import declarative_base
app_engine = create_engine('databasestring', poolclass=NullPool)
db_app_session = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=app_engine))
Base = declarative_base()
Base.query = db_app_session.query_property()
def init_db():
    Base.metadata.create_all(bind=app_engine)
and
db.py
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import NullPool
app_engine = create_engine('databasestring', poolclass=NullPool)
Session = sessionmaker(bind=app_engine)
And then I import the db.py session/connection into the Dash app.
Depending on the contents of db.py, I use it in this way in the Dash callback:
dash.py
@app.callback(
    # Callback input/output
    ....
)
def update_graph(rows):
    # ... Callback logic
    session = Session()
    domain: Domain = session.query(Domain).get(domain_id)
    # Do stuff
    session.close()
    session.bind.dispose()
I've tried to close the database connections in the __init__.py of the Flask app with @app.after_request or @app.teardown_request, but neither of these seemed to work either.
__init__.py
@app.after_request
def after_request(response):
    session = Session()
    session.close()
    session.bind.dispose()
    return response
I am aware of the Flask-SQLAlchemy package and tried that one as well, but with similar results. When using similar code outside of Flask/Dash, closing the connections after the code finishes does seem to work.
Adding the NullPool helped get the connections closed when code is executed outside of Flask/Dash, but not within the web app itself. So something still goes wrong within Flask/Dash, but I am unable to find what.
Who can point me in the right direction?
I've also found this issue and pinpointed it to the login_required decorator. Essentially, each Dash view route has this decorator, so any time the Dash app is opened in Flask, it opens up a new DB connection, querying for the current user. I've brought it up in a GitHub post here.
I tried this out (in addition to the NullPool configuration) and it worked. Not sure if it's the right solution, since it disposes of the engine on every user load. Try it out and let me know.
@login.user_loader
def load_user(id):
    user = db.session.query(User).filter(User.id == id).one_or_none()
    db.session.close()
    engine = db.get_engine(current_app)
    engine.dispose()
    return user
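A lighter-weight variant you could try instead of disposing the engine on every user load is to remove the scoped session when the app context tears down; a sketch, assuming the standard Flask-SQLAlchemy db object and access to the app instance:
@app.teardown_appcontext
def shutdown_session(exception=None):
    # Returns the session's connection to the pool (or closes it outright
    # under NullPool) at the end of every request/app context.
    db.session.remove()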
I have a multi-tenancy Python Falcon app. Every tenant has their own database, and on each incoming request I need to connect to the tenant's database.
But there is a complication: database configs are stored on another service, and the configs change regularly.
I tried creating the session in process_resource, but SQL queries slowed down after this change. What should I do to make this faster?
P.S.: I use PostgreSQL.
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
import config
import json
import requests

class DatabaseMiddleware:
    def __init__(self):
        pass

    def process_resource(self, req, resp, resource, params):
        engineConfig = requests.get('http://database:10003/v1/databases?loadOnly=config&appId=06535108-111a-11e9-ab14-d663bd873d93').text
        engineConfig = json.loads(engineConfig)
        engine = create_engine(
            '{dms}://{user}:{password}@{host}:{port}/{dbName}'.format(
                dms=engineConfig[0]['config']['dms'],
                user=engineConfig[0]['config']['user'],
                password=engineConfig[0]['config']['password'],
                host=engineConfig[0]['config']['host'],
                port=engineConfig[0]['config']['port'],
                dbName=engineConfig[0]['config']['dbName']
            ))
        session_factory = sessionmaker(bind=engine, autoflush=True)
        databaseSession = scoped_session(session_factory)
        resource.databaseSession = databaseSession

    def process_response(self, req, resp, resource, req_succeeded):
        if hasattr(resource, 'databaseSession'):
            if not req_succeeded:
                resource.databaseSession.rollback()
            resource.databaseSession.remove()
Your approach is probably wrong, since it goes against the intended usage pattern of engine instances described in engine disposal. The lifetime of an engine instance should be the same as that of your middleware instance.
The Engine refers to a connection pool, which means under normal circumstances, there are open database connections present while the Engine object is still resident in memory. When an Engine is garbage collected, its connection pool is no longer referred to by that Engine, and assuming none of its connections are still checked out, the pool and its connections will also be garbage collected, which has the effect of closing out the actual database connections as well. But otherwise, the Engine will hold onto open database connections assuming it uses the normally default pool implementation of QueuePool.
The Engine is intended to normally be a permanent fixture established up-front and maintained throughout the lifespan of an application. It is not intended to be created and disposed on a per-connection basis; it is instead a registry that maintains both a pool of connections as well as configurational information about the database and DBAPI in use, as well as some degree of internal caching of per-database resources.
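Concretely, this means the middleware should create each tenant's engine once and then reuse it, rather than building a new engine (and a new pool) on every request. A minimal sketch under that assumption, reusing create_engine from the question (how you key the cache and refresh stale configs is up to you, since your configs change regularly):
class DatabaseMiddleware:
    def __init__(self):
        # One Engine per tenant database URL, built lazily and kept for
        # the lifetime of the middleware so its connection pool is reused.
        self._engines = {}

    def _engine_for(self, url):
        engine = self._engines.get(url)
        if engine is None:
            engine = self._engines[url] = create_engine(url)
        return engine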
In conjunction with SQLAlchemy, I use SQLService as an interface layer to SQLAlchemy's session manager and ORM layer, which nicely centralizes the core functionality of SQLAlchemy.
Here is my middleware component definition:
class DatabaseSessionComponent(object):
    """Initiates a new Session for incoming request and closes it in the end."""

    def __init__(self, sqlalchemy_database_uri):
        self.sqlalchemy_database_uri = sqlalchemy_database_uri

    def process_resource(self, req, resp, resource, params):
        resource.db = sqlservice.SQLClient(
            self.sqlalchemy_database_uri,
            model_class=BaseModel
        )

    def process_response(self, req, resp, resource):
        if hasattr(resource, "db"):
            resource.db.disconnect()
And here is its instantiation within the API's setup:
api = falcon.API(
    middleware=[
        DatabaseSessionComponent(os.environ["SQLALCHEMY_DATABASE_URI"]),
    ]
)
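If you'd rather not depend on sqlservice, the same per-request lifecycle can be sketched with plain SQLAlchemy (illustrative names, same Falcon hooks as above):
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

class PlainDatabaseSessionComponent(object):
    """One engine for the app's lifetime, one session per request."""

    def __init__(self, sqlalchemy_database_uri):
        self.engine = create_engine(sqlalchemy_database_uri)
        self.session_factory = sessionmaker(bind=self.engine)

    def process_resource(self, req, resp, resource, params):
        resource.db = self.session_factory()

    def process_response(self, req, resp, resource):
        if hasattr(resource, "db"):
            resource.db.close()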
I have a web application that runs long jobs that are independent of user sessions. To achieve this, I have an implementation of a thread-local Flask-SQLAlchemy session. The problem is that a few times a day I get a "MySQL server has gone away" error when I visit my site. The site always loads upon refresh. I think the issue is related to these thread-local sessions, but I'm not sure.
This is my implementation of a thread-local session scope:
@contextmanager
def thread_local_session_scope():
    """Provides a transactional scope around a series of operations.
    Context is local to current thread.
    """
    # See this StackOverflow answer for details:
    # http://stackoverflow.com/a/18265238/1830334
    Session = scoped_session(session_factory)
    threaded_session = Session()
    try:
        yield threaded_session
        threaded_session.commit()
    except:
        threaded_session.rollback()
        raise
    finally:
        Session.remove()
And here is my standard Flask-SQLAlchemy session:
@contextmanager
def session_scope():
    """Provides a transactional scope around a series of operations.
    Context is HTTP request thread using Flask-SQLAlchemy.
    """
    try:
        yield db.session
        db.session.commit()
    except Exception as e:
        print 'Rolling back database'
        print e
        db.session.rollback()
    # Flask-SQLAlchemy handles closing the session after the HTTP request.
Then I use both session context managers like this:
def build_report(tag):
    report = _save_report(Report())
    thread = Thread(target=_build_report, args=(report.id,))
    thread.daemon = True
    thread.start()
    return report.id

# This executes in the main thread.
def _save_report(report):
    with session_scope() as session:
        session.add(report)
        session.commit()
        return report

# This executes in a separate thread.
def _build_report(report_id):
    with thread_local_session_scope() as session:
        report = do_some_stuff(report_id)
        session.merge(report)
EDIT: Engine configurations
app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://<username>:<password>@<server>:3306/<db>?charset=utf8'
app.config['SQLALCHEMY_POOL_RECYCLE'] = 3600
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
Try adding an app.teardown_request(exception=None) decorator, which executes at the end of each request. I am currently experiencing a similar issue, and it seems as if today I have actually resolved it using:
@app.teardown_request
def teardown_request(exception=None):
    Session.remove()
    if exception and Session.is_active:
        print(exception)
        Session.rollback()
I do not use Flask-SQLAlchemy, only raw SQLAlchemy, so there may be differences for you.
From the Docs
The teardown callbacks are special callbacks in that they are executed
at a different point. Strictly speaking they are independent of the
actual request handling as they are bound to the lifecycle of the
RequestContext object. When the request context is popped, the
teardown_request() functions are called.
In my case, I open a new scoped_session for each request, requiring me to remove it at the end of each request (Flask-SQLAlchemy may not need this). Also, the teardown_request function is passed an Exception if one occurred during the context. In this scenario, if an exception occurred (possibly causing the transaction to not be removed, or to need a rollback), we check whether there was an exception and roll back.
If this doesn't work in my own testing, the next thing I was going to try was a session.commit() at each teardown, just to make sure everything is flushing.
UPDATE: it also appears MySQL invalidates connections after 8 hours (its default wait_timeout), causing the Session to be corrupted.
Set pool_recycle=3600 in your engine configuration, or to a setting less than the MySQL timeout. This, in conjunction with proper session scoping (closing sessions), should do it.
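For example (pool_pre_ping additionally requires SQLAlchemy 1.2+ and tests each connection with a lightweight ping as it is checked out):
engine = create_engine(
    'mysql://<username>:<password>@<server>:3306/<db>?charset=utf8',
    pool_recycle=3600,   # recycle connections well before MySQL's 8-hour wait_timeout
    pool_pre_ping=True,  # detect and replace stale connections on checkout
)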
I have 2 functions that need to be executed, and the first takes about 4 hours to run. Both use SQLAlchemy:
def first():
    session = DBSession
    rows = session.query(Mytable).order_by(Mytable.col1.desc())[:150]
    for i, row in enumerate(rows):
        time.sleep(100)
        print i, row.accession

def second():
    print "going onto second function"
    session = DBSession
    new_row = session.query(Anothertable).order_by(Anothertable.col1.desc()).first()
    print 'New Row: ', new_row.accession

first()
second()
And here is how I define DBSession:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy import create_engine
engine = create_engine('mysql://blah:blah#blah/blahblah',echo=False,pool_recycle=3600*12)
DBSession = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=engine))
Base = declarative_base()
Base.metadata.bind = engine
first() finishes fine (takes about 4 hrs) and I see "going onto second function" printed, then it immediately gives me an error:
sqlalchemy.exc.OperationalError: (OperationalError) (2006, 'MySQL server has gone away')
From reading the docs I thought assigning session = DBSession would give me two different session instances, and so second() wouldn't time out. I've also tried playing with pool_recycle, and that doesn't seem to have any effect here. In the real world, I can't split first() and second() into 2 scripts: second() has to execute immediately after first().
Your engine (not session) keeps a pool of connections. When a MySQL connection has not been used for several hours, the MySQL server closes the socket; this causes a "MySQL server has gone away" error when you try to use that connection. If you have a simple single-threaded script, then calling create_engine with pool_size=1 will probably do the trick. If not, you can use events to ping the connection when it is checked out of the pool. This great answer has all the details:
SQLAlchemy error MySQL server has gone away
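For reference, the event-based ping from that answer looks roughly like this (the classic "pessimistic disconnect handling" recipe; on SQLAlchemy 1.2+ you can simply pass pool_pre_ping=True to create_engine instead):
from sqlalchemy import event, exc
from sqlalchemy.pool import Pool

@event.listens_for(Pool, "checkout")
def ping_connection(dbapi_connection, connection_record, connection_proxy):
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SELECT 1")
    except Exception:
        # Tells the pool to discard this connection and retry the checkout
        # with a fresh one.
        raise exc.DisconnectionError()
    finally:
        cursor.close()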
"assigning session=DBSession would get two different session instances"
That simply isn't true. session = DBSession is a local variable assignment, and you cannot override local variable assignment in Python (you can override instance member assignment, but that's unrelated).
Another thing to note is that scoped_session produces, by default, a thread-local scoped session (i.e. all code in the same thread shares the same session). Since you call first() and second() in the same thread, they get one and the same session.
One thing you can do is use regular (unscoped) sessions: manage your session scope manually and create a new session in each function, as sketched below. Alternatively, you can check the docs on how to define a custom session scope.
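A minimal sketch of the unscoped approach, reusing the engine from the question:
from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)  # plain factory, no scoped_session

def first():
    session = Session()  # a genuinely new session
    try:
        ...  # query Mytable here
    finally:
        session.close()

def second():
    session = Session()  # another new session, independent of first()'s
    try:
        ...  # query Anothertable here
    finally:
        session.close()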
It doesn't look like you're getting separate Session instances. If the first query is successfully committing, then your Session could be expiring after that commit.
Try setting auto-expire to false for your session:
DBSession = scoped_session(sessionmaker(expire_on_commit=False, autocommit=False, autoflush=False, bind=engine))
and then commit later.
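To illustrate what expire_on_commit=False changes:
session = DBSession()
row = session.query(Mytable).first()
session.commit()
# With expire_on_commit=False this attribute access uses the value already
# loaded; with the default (True) it would trigger a refresh SELECT, which
# needs a live connection and can hit "MySQL server has gone away".
print(row.accession)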