I am running into a problem where my Dash/Flask web app uses too many MySQL resources when it runs for a longer time. Eventually the server becomes incredibly slow because it tries to keep too many database connections alive. The project was originally based on this article and is still organised in a similar way: https://hackersandslackers.com/plotly-dash-with-flask/
Once I open a URL on the website, each Dash callback seems to open its own connection to the database. Apart from the callbacks, Flask opens a database connection as well to store the user session. The number of connections open at the same time isn't really the problem; the problem is that the connections aren't closed once the request is finished.
I've tried different settings and ways to set up the database connection, but none of them solved the problem of connections staying open after the request is finished. Eventually the database runs out of resources because it tries to keep too many connections open, and the web app becomes unusable.
I've tried
db.py
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import NullPool
from sqlalchemy.ext.declarative import declarative_base
app_engine = create_engine('databasestring', poolclass=NullPool)
db_app_session = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=app_engine))
Base = declarative_base()
Base.query = db_app_session.query_property()
def init_db():
    Base.metadata.create_all(bind=app_engine)
and
db.py
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import NullPool
app_engine = create_engine('databasestring', poolclass=NullPool)
Session = sessionmaker(bind=app_engine)
I then import the session/connection from db.py into the Dash app.
Depending on the contents of db.py, I use it like this in a Dash callback:
dash.py
@app.callback(
    # Callback input/output
    ....
)
def update_graph(rows):
    # ... Callback logic
    session = Session()
    domain: Domain = session.query(Domain).get(domain_id)
    # Do stuff
    session.close()
    session.bind.dispose()
I've tried to close the database connections in the __init__.py of the Flask app with @app.after_request or @app.teardown_request, but none of these seemed to work either.
__init__.py
@app.after_request
def after_request(response):
    session = Session()
    session.close()
    session.bind.dispose()
    return response
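For reference, the teardown pattern from the Flask/SQLAlchemy docs keeps a single scoped session and calls remove() on it at the end of the request, rather than opening and closing a brand-new session. A minimal sketch, using db_app_session from the first db.py variant above:

from db import db_app_session

@app.teardown_appcontext
def shutdown_session(exception=None):
    # closes the session used during this request and returns its connection
    # to the pool (or closes it outright when the engine uses NullPool)
    db_app_session.remove()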
I am aware of the Flask-SQLAlchemy package and tried that one as well, but with similar results. When I use similar code outside of Flask/Dash, closing the connections after the code finishes does work.
Adding the NullPool helped to get the connections closed when the code is executed outside of Flask/Dash, but not within the web app itself. So something still goes wrong within Flask/Dash, but I am unable to find what.
Can anyone point me in the right direction?
I've also found this issue and pinpointed it to the login_required decorator. Essentially, each Dash view route has this decorator, so any time the Dash app is opened in Flask it opens a new DB connection to query for the current user. I've brought it up on a GitHub post here.
I tried this out (in addition to the NullPool configuration) and it worked. I'm not sure it's the right solution, since it disposes of the whole engine. Try it out and let me know.
@login.user_loader
def load_user(id):
    user = db.session.query(User).filter(User.id == id).one_or_none()
    db.session.close()
    engine = db.get_engine(current_app)
    engine.dispose()
    return user
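If disposing the whole engine on every user load feels heavy-handed, another option worth trying (a sketch, not verified against this exact setup) is to configure Flask-SQLAlchemy's engine with NullPool, so that closing the session also closes the underlying MySQL connection. Flask-SQLAlchemy 2.4+ passes SQLALCHEMY_ENGINE_OPTIONS straight through to create_engine:

from sqlalchemy.pool import NullPool

class Config:
    # with NullPool there is no pooling at all, so a connection is closed
    # as soon as the session that checked it out is closed
    SQLALCHEMY_ENGINE_OPTIONS = {"poolclass": NullPool}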
Related
I have a route in my Flask app that spawns a process (using multiprocessing.Process) to do some background work. That process needs to be able to write to the database.
__init__.py:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from project.config import Config

db = SQLAlchemy()

# app factory
def create_app(config_class=Config):
    app = Flask(__name__)
    app.config.from_object(config_class)
    db.init_app(app)
    return app
And this is the relevant code that illustrates the way I'm spawning the process and using the DB connection:
from multiprocessing import Process
from flask import redirect

def worker(row_id):
    db_session = db.create_scoped_session()
    # Do stuff with db_session here
    db_session.close()

@app.route('/worker/<row_id>/start')
def start(row_id):
    p = Process(target=worker, args=(row_id,))
    p.start()
    return redirect('/')
The problem is that sometimes (not always) I get errors like this one:
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) insufficient data in "D" message lost synchronization with server: got message type "a", length 1668573551
I assume this is related to the fact that there is another process accessing the database (because if I don't use a separate process, everything works fine), but I honestly can't find a way to fix it. As you can see in my code, I tried using the create_scoped_session() method as an attempt to fix this, but the problem remains.
Any help?
OK so, I followed @antont's hint and created a new SQLAlchemy session inside the worker function this way, and it worked flawlessly:
import os

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

def worker(row_id):
    db_url = os.environ['DATABASE_URL']
    db_engine = create_engine(db_url)
    Session = sessionmaker(bind=db_engine)
    db_session = Session()
    # Do stuff with db_session here
    db_session.close()
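One extra bit of hygiene that might be worth adding (a sketch of the same worker, reusing the imports above): do the work in a try/finally and dispose of the worker's own engine at the end, so its pooled connections don't linger for the rest of the process's lifetime:

def worker(row_id):
    db_engine = create_engine(os.environ['DATABASE_URL'])
    Session = sessionmaker(bind=db_engine)
    db_session = Session()
    try:
        # Do stuff with db_session here
        db_session.commit()
    finally:
        db_session.close()
        db_engine.dispose()  # close the connections this worker opened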
I have one small app that uses FastAPI. The problem is that when I deploy it to my server and make a POST request to a route that contains some database operations, it just gets stuck and gives me a 504 error. But on my local machine it works well.
Here is how my DB connection is set up:
app.add_event_handler("startup", tasks.create_start_app_handler(app))
app.add_event_handler("shutdown", tasks.create_stop_app_handler(app))
To test, I tried moving the DB connection away from the application startup event and instead creating and closing it inside a different route, and that worked. Like:
#app.get("/")
async def create_item():
engine = create_engine(
DB_URL
)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base = declarative_base()
t = engine.execute('SELECT * FROM auth_user').fetchone()
engine.dispose()
return t
How does it depend on the events? The PostgreSQL versions are different, but I don't think that's the cause.
Currently I have a deployment with 2 pods running. When I use the psql command I can connect normally, so it only gets stuck in the application, not in the pod.
In case somebody runs into the same issue: I fixed it by updating Pgpool from 4.2.2 to the latest version.
I'm trying to keep a "constant" (until logout) connection to the DB for each user session.
On every API call the user should be served by their own connection, not by one of the connections available in the pool.
I'm using Docker with tiangolo/uwsgi-nginx-flask:python3.7.
Config:
from sqlalchemy import create_engine, text
from sqlalchemy.orm.session import sessionmaker
from sqlalchemy.orm import scoped_session
engine = create_engine('mysql://....')
Session = sessionmaker(bind=engine, autocommit=True)
DBSession = scoped_session(Session)
Route example:
sess = DBSession()
call = text("CALL call(:mode, :session,'');").params(mode=mode, session=session['session'])
res = sess.execute(call).fetchall()
return ...
if __name__ == '__main__':
app.run(host='0.0.0.0', port=5101, ssl_context=context, debug=False, threaded=True)
After building, I use SHOW PROCESSLIST to check the results. Right now there is only one connection for everything.
Maybe I misunderstood something.
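If I understand the docs right, scoped_session is scoped per thread, not per logged-in user, and every Session just borrows connections from the shared engine pool. A tiny illustration of what I mean:

# within one request-handling thread, DBSession() always returns the same
# thread-local Session; its connections come from the shared engine pool
s1 = DBSession()
s2 = DBSession()
assert s1 is s2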
Edit: I need to keep the connection alive in order to access temp tables.
I have a multi-tenancy Python Falcon app. Every tenant has their own database, and on each incoming request I need to connect to that tenant's database.
But there is a complication: the database configs are stored on another service and they change regularly.
I tried creating the session in process_resource, but SQL queries slowed down after this change. What should I do to make this faster?
P.S. : I use PostgreSQL
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
import config
import json
import requests
class DatabaseMiddleware:
    def __init__(self):
        pass

    def process_resource(self, req, resp, resource, params):
        engineConfig = requests.get('http://database:10003/v1/databases?loadOnly=config&appId=06535108-111a-11e9-ab14-d663bd873d93').text
        engineConfig = json.loads(engineConfig)
        engine = create_engine(
            '{dms}://{user}:{password}@{host}:{port}/{dbName}'.format(
                dms=engineConfig[0]['config']['dms'],
                user=engineConfig[0]['config']['user'],
                password=engineConfig[0]['config']['password'],
                host=engineConfig[0]['config']['host'],
                port=engineConfig[0]['config']['port'],
                dbName=engineConfig[0]['config']['dbName']
            ))
        session_factory = sessionmaker(bind=engine, autoflush=True)
        databaseSession = scoped_session(session_factory)
        resource.databaseSession = databaseSession

    def process_response(self, req, resp, resource, req_succeeded):
        if hasattr(resource, 'databaseSession'):
            if not req_succeeded:
                resource.databaseSession.rollback()
            resource.databaseSession.remove()
Your approach is probably wrong, since it goes against the intended usage pattern of Engine instances described in Engine Disposal. The lifetime of an Engine instance should be the same as the lifetime of your middleware instance.
The Engine refers to a connection pool, which means under normal circumstances, there are open database connections present while the Engine object is still resident in memory. When an Engine is garbage collected, its connection pool is no longer referred to by that Engine, and assuming none of its connections are still checked out, the pool and its connections will also be garbage collected, which has the effect of closing out the actual database connections as well. But otherwise, the Engine will hold onto open database connections assuming it uses the normally default pool implementation of QueuePool.
The Engine is intended to normally be a permanent fixture established up-front and maintained throughout the lifespan of an application. It is not intended to be created and disposed on a per-connection basis; it is instead a registry that maintains both a pool of connections as well as configurational information about the database and DBAPI in use, as well as some degree of internal caching of per-database resources.
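Concretely, one way to respect that lifetime in the multi-tenant setup above is to cache one Engine per tenant database URL on the middleware itself, so repeated requests for the same tenant reuse the pool instead of paying connection setup on every call. A rough sketch (the _build_db_url helper is a placeholder for the config-service lookup shown in the question):

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

class DatabaseMiddleware:
    def __init__(self):
        # one Engine (wrapped in a scoped_session registry) per tenant
        # database URL, kept for the lifetime of the middleware instance
        self._sessions = {}

    def _build_db_url(self, req):
        # placeholder: in the question this comes from the config service
        raise NotImplementedError

    def _scoped_session_for(self, db_url):
        if db_url not in self._sessions:
            engine = create_engine(db_url, pool_pre_ping=True)
            self._sessions[db_url] = scoped_session(sessionmaker(bind=engine, autoflush=True))
        return self._sessions[db_url]

    def process_resource(self, req, resp, resource, params):
        resource.databaseSession = self._scoped_session_for(self._build_db_url(req))

    def process_response(self, req, resp, resource, req_succeeded):
        if hasattr(resource, 'databaseSession'):
            if not req_succeeded:
                resource.databaseSession.rollback()
            resource.databaseSession.remove()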
In conjunction with SQLAlchemy, I use SQLService as an interface layer to SQLAlchemy's session manager and ORM layer, which nicely centralizes the core functionality of SQLAlchemy.
Here is my middleware component definition:
class DatabaseSessionComponent(object):
    """Initiates a new Session for an incoming request and closes it in the end."""

    def __init__(self, sqlalchemy_database_uri):
        self.sqlalchemy_database_uri = sqlalchemy_database_uri

    def process_resource(self, req, resp, resource, params):
        resource.db = sqlservice.SQLClient(
            self.sqlalchemy_database_uri,
            model_class=BaseModel
        )

    def process_response(self, req, resp, resource):
        if hasattr(resource, "db"):
            resource.db.disconnect()
It is instantiated when the API itself is instantiated:
api = falcon.API(
    middleware=[
        DatabaseSessionComponent(os.environ["SQLALCHEMY_DATABASE_URI"]),
    ]
)
I am running an application with Flask and Flask-SQLAlchemy.
from config import FlaskDatabaseConfig
from flask import Flask
from flask import request
from flask_migrate import Migrate
from flask_sqlalchemy import SQLAlchemy

application = Flask(__name__)
application.config.from_object(FlaskDatabaseConfig())
db = SQLAlchemy(application)

@application.route("/queue/request", methods=["POST"])
def handle_queued_request():
    stuff()
    return ""

def stuff():
    # Includes database queries and updates and a call to db.session.commit()
    # db.session.begin() and db.session.close() are not called
    pass

if __name__ == "__main__":
    application.run(debug=False, port=5001)
Now, from my understanding, by using Flask-SQLAlchemy I do not need to manage sessions on my own. So why am I getting the following error when I run several requests against my endpoint one after another?
sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/3o7r)
I've tried using db.session.close(), but then, instead of this error, my database updates are not committed properly. What am I doing incorrectly? Do I need to manually close connections with the database once a request has been handled?
I have found a solution to this. The issue was that I had a lot of processes that were "idle in transaction" because I did not call db.session.commit() after making certain database SELECT statements using Query.first().
To investigate this, I queried my (development) PostgreSQL database directly using:
SELECT * FROM pg_stat_activity
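The change itself was small. Roughly (the model name is illustrative; db is the Flask-SQLAlchemy instance from the question):

def stuff():
    # even a read-only Query.first() starts a transaction on the session's
    # connection; ending it hands the connection back instead of leaving it
    # "idle in transaction"
    item = db.session.query(SomeModel).first()
    # ... work with item ...
    db.session.commit()  # or db.session.rollback() on read-only paths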
Just remove the session every time you make a query to the DB:
products = db.session.query(Product).limit(20).all()
db.session.remove()