Python Flask MySQL: How to persistently store connection pool object?

After a long search, I could not find an answer to my question, nor whether what I want is even possible. My question concerns a MySQL connection implementation for a Flask API. What I want to implement is as follows:
When the Flask app is started, a create_db_connection method is called, which creates a number of MySQL connections in a pooling object.
For each incoming request, a get_connection method is called to get one connection from the pool.
And of course, when the request ends, a close_connection method is called to close the connection and mark it as available in the pool.
The problem I'm having is how to persistently store the connection pool so that it can be re-used for each request.
create_db_connection method:
def create_db_connection():
    print("-----INITIALISING-----")
    db_pool = mysql.connector.pooling.MySQLConnectionPool(
        pool_name='BestLiar_Public_API',
        pool_size=10,
        autocommit=True,
        pool_reset_session=True,
        user='user',
        password='pass',
        host='host',
        database='db')
    print("-----DB POOL INITIALISED-----")
    # SOLUTION TO PERSISTENTLY SAVE db_pool OBJECT
get_connection method:
def __enter__(self):
    try:
        # SOLUTION TO FETCH THE db_pool OBJECT > NAMED AS db_pool IN LINE BELOW
        self.con = db_pool.get_connection()
        self.cur = self.con.cursor(dictionary=True)
        if self.con.is_connected():
            return {'cur': self.cur, 'con': self.con}
        else:
            raise NoConnectionError("No database connection", "Pool connection not connected")
    except mysql.connector.PoolError:
        raise SystemOverload("Too many requests, could not process", "No pool connection available")
    except:
        raise NoConnectionError("No database connection", "Unknown reason")
close_connection method:
def __exit__(self, type, value, traceback):
    if self.con:
        self.cur.close()
        self.con.close()
I have tried storing the db_pool object as a global variable (undesirable) and have tried the Flask global object g (which only lives for one request).
Does anyone have the key to the solution?
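One pattern that may fit (a minimal sketch, assuming mysql-connector-python and a Flask application-factory setup; not a confirmed solution from this thread): create the pool once when the app is built, store it in app.extensions so it outlives individual requests, and hand out one pooled connection per request via flask.g:

import mysql.connector.pooling
from flask import Flask, current_app, g

def create_app():
    app = Flask(__name__)
    # Create the pool once, at app start-up; it lives on the app object
    # and therefore survives across requests.
    app.extensions["db_pool"] = mysql.connector.pooling.MySQLConnectionPool(
        pool_name="BestLiar_Public_API",
        pool_size=10,
        autocommit=True,
        pool_reset_session=True,
        user="user", password="pass", host="host", database="db")

    @app.teardown_appcontext
    def close_connection(exc):
        # Return this request's connection to the pool, if one was taken.
        con = g.pop("db_con", None)
        if con is not None:
            con.close()

    return app

def get_connection():
    # Fetch one pooled connection per request, cached on flask.g.
    if "db_con" not in g:
        g.db_con = current_app.extensions["db_pool"].get_connection()
    return g.db_con

Because the pool hangs off the app object rather than a bare global, it is created exactly once, while g scopes each checked-out connection to a single request.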

Related

Python SQLalchemy - can I pass the connection object between functions?

I have a Python application that reads from MySQL/MariaDB, uses that to fetch data from an API, and then inserts the results into another table.
I had set up a module with a function that connects to the database and returns the connection object, which is then passed to other functions/modules. However, I believe this might not be the correct approach. The idea was to have a small module that I could just call whenever I needed to connect to the DB.
Also note that I am using the same connection object during loops (and within the loop passing it to the db_update module) and call close() when all is done.
I am also getting some warnings from the DB sometimes; those mostly happen at the point where I call db_conn.close(), so I guess I am not handling the connection or session/engine correctly. Also, the connection IDs in the log warning keep increasing, so that is another hint that I am doing it wrong.
[Warning] Aborted connection 351 to db: 'some_db' user: 'some_user' host: '172.28.0.3' (Got an error reading communication packets)
Here is some pseudo code that represents the structure I currently have:
################
## db_connect.py
################
# imports ...
from sqlalchemy import create_engine

def db_connect():
    # get env ...
    db_string = f"mysql+pymysql://{db_user}:{db_pass}@{db_host}:{db_port}/{db_name}"
    try:
        engine = create_engine(db_string)
    except Exception as e:
        return None
    db_conn = engine.connect()
    return db_conn
################
## db_update.py
################
# imports ...
from sqlalchemy import text

def db_insert(db_conn, api_result):
    # ...
    ins_qry = "INSERT INTO target_table (attr_a, attr_b) VALUES (:a, :b);"
    ins_qry = text(ins_qry)
    ins_qry = ins_qry.bindparams(a=value_a, b=value_b)
    try:
        db_conn.execute(ins_qry)
    except Exception as e:
        print(e)
        return None
    return True
################
## main.py
################
from sqlalchemy import text
from db_connect import db_connect
from db_update import db_insert

def run():
    try:
        db_conn = db_connect()
        if not db_conn:
            return False
    except Exception as e:
        print(e)
    qry = """SELECT *
             FROM some_table
             WHERE some_attr IN (:some_value);"""
    qry = text(qry)
    search_run_qry = qry.bindparams(
        some_value='abc'
    )
    result_list = db_conn.execute(search_run_qry).fetchall()
    for result_item in result_list:
        ## do stuff like fetching data from api for every record in the query result
        api_result = get_api_data(...)
        ## insert into db:
        db_ins_status = db_insert(db_conn, api_result)
        ## ...
    db_conn.close()

run()
EDIT: Another question:
a) Is it OK, in a loop that does an update on every iteration, to use the same connection, or would it be wiser to instead pass the engine to the run() function and call db_conn = engine.connect() and db_conn.close() just before and after each update?
b) I am thinking about using a ThreadPoolExecutor instead of the loop for the API calls. Would this have implications for how to use the connection, i.e. can I use the same connection for multiple threads that are doing updates to the same table?
Note: I am not using the ORM feature, mostly because I have a strong DWH/SQL background (though not so much as a DBA) and I am used to writing even complex SQL queries. I am thinking about switching to just using the PyMySQL connector for that reason.
Thanks in advance!
Yes, you can return/pass the connection object as a parameter, but what is the purpose of the db_connect method other than testing the connection? As I see it, the db_connect method serves no purpose, so I would recommend doing it the way I have done it before.
I would like to share a code snippet from one of my project.
def create_record(sql_query: str, data: tuple):
    try:
        connection = mysql_obj.connect()
        db_cursor = connection.cursor()
        db_cursor.execute(sql_query, data)
        connection.commit()
        return db_cursor, connection
    except Exception as error:
        print(f'Connection failed error message: {error}')
and then, for another need, calling a similar fetch_data function:
db_cursor, connection, query_data = fetch_data(sql_query, query_data)
and, when everything is done, closing the connection with this method:
def close_connection(connection, db_cursor):
    """
    This method is used to close the SQL server connection.
    """
    db_cursor.close()
    connection.close()
and the method call:
close_connection(connection, db_cursor)
I am not sure whether I can share my GitHub here, but please check the link: under model.py you can see the database methods, and main.py shows how they are called.
Best,
Hasan.
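Regarding the edited questions (a) and (b), a minimal sketch under the assumption of SQLAlchemy 1.4+ (api_results and target_table are illustrative names, not from the thread): share one engine per application and let each unit of work check its own connection out of the engine's pool. The engine is safe to share across threads; individual connections are not, which answers (b).

from concurrent.futures import ThreadPoolExecutor
from sqlalchemy import create_engine, text

# One engine per application; it owns the connection pool.
engine = create_engine("mysql+pymysql://user:pass@host:3306/db", pool_pre_ping=True)

def insert_result(api_result):
    # engine.begin() checks a connection out of the pool, opens a
    # transaction, commits on success, and returns the connection
    # to the pool when the block ends.
    with engine.begin() as conn:
        conn.execute(
            text("INSERT INTO target_table (attr_a, attr_b) VALUES (:a, :b)"),
            {"a": api_result["a"], "b": api_result["b"]},
        )

api_results = [{"a": 1, "b": 2}]  # placeholder data
with ThreadPoolExecutor(max_workers=4) as executor:
    executor.map(insert_result, api_results)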

Django with Peewee Connection Pooling MySQL disconnect

I'm running a Django project with Peewee in Python 3.6 and trying to track down what's wrong with the connection pooling. I keep getting the following error on the development server (for some reason I never experience this issue on my local machine):
Lost connection to MySQL server during query
The repro steps are reliable and are:
Restart Apache on the instance.
Go to my Django page and press a button which triggers a DB operation.
Works fine.
Wait exactly 10 minutes (I've tested enough to get the exact number).
Press another button to trigger another DB operation.
Get the lost connection error above.
The code is structured such that I have all the DB operations inside an independent Python module which is imported into the Django module.
In the main class constructor I'm setting up the DB as such:
from playhouse.pool import PooledMySQLDatabase
def __init__(self, host, database, user, password, stale_timeout=300):
self.mysql_db = PooledMySQLDatabase(host=host, database=database, user=user, password=password, stale_timeout=stale_timeout)
db_proxy.initialize(self.mysql_db)
Every call that needs to reach out to the DB is done like this:
def get_user_by_id(self, user_id):
    db_proxy.connect(reuse_if_open=True)
    user = (User.get(User.user_id == user_id))
    db_proxy.close()
    return {'id': user.user_id, 'first_name': user.first_name, 'last_name': user.last_name, 'email': user.email}
I looked at the wait_timeout value on the MySQL instance and its value is 3600 so that doesn't seem to be the issue (and I tried changing it anyway just to see).
Any ideas on what I could be doing wrong here?
Update:
I found that the /etc/my.cnf configuration file for MySQL has the wait-timeout value set to 600, which matches what I'm experiencing. I don't know why this value doesn't show when I run SHOW VARIABLES LIKE 'wait_timeout'; on the MySQL DB (that returns 3600), but it does seem likely the issue is coming from the wait timeout.
Given this I tried setting the stale timeout to 60, assuming that if it's less than the wait timeout it might fix the issue but it didn't make a difference.
You need to be sure you're recycling the connections properly -- that means that when a request begins you open a connection, and when the response is delivered you close the connection. The pool is most likely not recycling the connection because you're never putting it back in the pool, so it looks like it's still "in use". This can easily be done with middleware and is described here:
http://docs.peewee-orm.com/en/latest/peewee/database.html#django
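For reference, a minimal sketch of the connect-on-request/close-on-response pattern those docs describe, written as a function-style Django middleware (database stands for the Peewee database object; this is an illustration, not the docs' exact code):

def peewee_connection_middleware(get_response):
    def middleware(request):
        # Open (or reuse) a connection when the request starts ...
        database.connect(reuse_if_open=True)
        try:
            return get_response(request)
        finally:
            # ... and always return it to the pool once the response is built.
            if not database.is_closed():
                database.close()
    return middleware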
I finally came up with a fix which works for my case, after trying numerous ideas. It's not ideal but it works. This post on Connection pooling pointed me in the right direction.
I created a Django middleware class and configured it to be the first in the list of Django middleware.
from peewee import OperationalError
from playhouse.pool import PooledMySQLDatabase

database = PooledMySQLDatabase(None)

class PeeweeConnectionMiddleware(object):
    CONN_FAILURE_CODES = [2006, 2013]

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if database.database:  # Is DB initialized?
            response = None
            try:
                database.connect(reuse_if_open=True)
                with database.atomic() as transaction:
                    try:
                        response = self.get_response(request)
                    except:
                        transaction.rollback()
                        raise
            except OperationalError as exception:
                if exception.args[0] in self.CONN_FAILURE_CODES:
                    database.close_all()
                    database.connect()
                    response = None
                    with database.atomic() as transaction:
                        try:
                            response = self.get_response(request)
                        except:
                            transaction.rollback()
                            raise
                else:
                    raise
            finally:
                if not database.is_closed():
                    database.close()
            return response
        else:
            return self.get_response(request)
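To activate it as the first middleware, the class would be registered at the top of Django's MIDDLEWARE setting (the module path here is hypothetical):

MIDDLEWARE = [
    'myapp.middleware.PeeweeConnectionMiddleware',  # hypothetical path; must come first
    # ... Django's default middleware entries ...
]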

How can I reliably keep a SSH tunnel and MySQL connection open with my Python Flask API?

I have built an API in Flask that performs classification on text messages with Keras. I am currently using sshtunnel and MySQLdb to connect to a MySQL database to fetch messages from a remote database. The entire application is wrapped in a Docker container.
I am able to establish a connection to the remote database and successfully query it, but I am opening and closing a new ssh tunnel every time a POST request comes into the API, and this slows down performance.
I have tried to open a single ssh tunnel and database connection "to rule them all", but the connection goes stale after an hour or so of inactivity, and then API requests take forever and a day to complete.
How have you done this? Is this slowness unavoidable or is there a way to periodically refresh the ssh and database connections?
This is how I am connecting to my database for every incoming request:
with SSHTunnelForwarder(
    (host, 22),
    ssh_username=ssh_username,
    ssh_private_key=ssh_private_key,
    remote_bind_address=(localhost, 3306)
) as server:
    conn = db.connect(host=localhost,
                      port=server.local_bind_port,
                      user=user,
                      passwd=password,
                      db=database)
Okay, I figured it out. I created a DB object as suggested in this answer but with a slight modification. I kept track of the time that the connection to the database was created and then re-established the connection every 30 minutes. This means that one or two queries take slightly longer because I am rebuilding the connection to the database, but the rest of them run much faster and the connection won't go stale.
I've included some code below. I realize the code isn't perfect, but it's what has worked for me so far.
import MySQLdb as mydb
import time
import pandas as pd
from sshtunnel import SSHTunnelForwarder

class DB:
    def __init__(self):
        self.conn = None
        self.server = None
        self.open_ssh_tunnel()
        self.connect()
        self.last_connected_time = time.time()

    def open_ssh_tunnel(self):
        connection_success = False
        while not connection_success:
            try:
                self.server = SSHTunnelForwarder(
                    (host, 22),
                    ssh_username=ssh_username,
                    ssh_private_key=ssh_private_key,
                    ssh_password=ssh_pwd,
                    remote_bind_address=(localhost, 3306))
                connection_success = True
            except:
                time.sleep(0.5)
        self.server.start()

    def connect(self):
        connection_success = False
        while not connection_success:
            try:
                self.conn = mydb.connect(host=localhost,
                                         port=self.server.local_bind_port,
                                         user=user,
                                         passwd=password,
                                         db=database)
                connection_success = True
            except:
                time.sleep(0.5)

    def query(self, sql):
        result = None
        current_time = time.time()
        if current_time - self.last_connected_time > 1600:
            self.last_connected_time = current_time
            self.server.close()
            self.conn.close()
            self.open_ssh_tunnel()
            self.connect()
        try:
            result = pd.read_sql_query(sql, self.conn).values
            self.conn.commit()
        except:
            self.server.close()
            self.conn.close()
            self.open_ssh_tunnel()
            self.connect()
            result = pd.read_sql_query(sql, self.conn).values
        return result
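Usage might then look like this (a sketch; the query and table are illustrative):

db = DB()
rows = db.query("SELECT id, body FROM messages LIMIT 10")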

MySQL Connector/Python not closing connection explicitly

I have the following:
class FooData(object):
    def __init__(self):
        ...
        try:
            self.my_cnf = os.environ['HOME'] + '/.my.cnf'
            self.my_cxn = mysql.connector.connect(option_files=self.my_cnf)
            self.cursor = self.my_cxn.cursor(dictionary=True)
        except mysql.connector.Error as err:
            if err.errno == 2003:
                self.my_cnf = None
                self.my_cxn = None
                self.cursor = None
I am able to use my_cxn and cursor without any obvious failure. I never explicitly terminate the connection, though, and have observed the following messages in my MySQL error log...
2017-01-08T15:16:09.355190Z 132 [Note] Aborted connection 132 to db:
'mydatabase' user: 'myusername' host: 'localhost'
(Got an error reading communication packets)
Am I going about this the wrong way? Would it be more efficient for me to initialize my connector and cursor every time I need to run a query?
What do I need to look for in the MySQL config to avoid these aborted connections?
Separately, I also observe these messages in my error logs frequently:
2017-01-06T15:28:45.203067Z 0 [Warning] Changed limits: max_open_files: 1024
(requested 5000)
2017-01-06T15:28:45.205191Z 0 [Warning] Changed limits: table_open_cache: 431
(requested 2000)
Is it related to the above? What does it mean and how can I resolve it?
I tried various solutions involving /lib/systemd/system/mysql.service.d/limits.conf and other configuration settings but couldn't get any of them to work.
It's not a config issue. When you are done with a connection, you should close it by explicitly calling close. It is generally a best practice to maintain the connection for a long time, as creating one takes time. It's not possible to tell from your code snippet where the best place to close it would be; it's whenever you're "done" with it, perhaps at the end of your __main__ method. Similarly, you should close the cursor explicitly when you're done with it. Typically that happens after each query.
So, maybe something like:
class FooData(object):
    def __init__(self):
        ...
        self.my_cnf = os.environ['HOME'] + '/.my.cnf'
        self.my_cxn = mysql.connector.connect(option_files=self.my_cnf)

    def execute_some_query(self, query_info):
        """Runs a single query. Thus it creates a cursor to run the
        query and closes it when it's done."""
        # Note that cursor is not a member variable as it's only for the
        # life of this one query
        cursor = self.my_cxn.cursor(dictionary=True)
        cursor.execute(...)
        # All done, close the cursor
        cursor.close()

    def close(self):
        """Users of this class should **always** call close when they are
        done with this class so it can clean up the DB connection."""
        self.my_cxn.close()
You might also look into the Python with statement for a nice way to ensure everything is always cleaned up.
I rewrote my class above to look like this...
class FooData(object):
    def __init__(self):
        self.myconfig = {
            'option_files': os.environ['HOME'] + '/.my.cnf',
            'database': 'nsdata'
        }
        self.mysqlcxn = None

    def __enter__(self):
        try:
            self.mysqlcxn = mysql.connector.connect(**self.myconfig)
        except mysql.connector.Error as err:
            if err.errno == 2003:
                self.mysqlcxn = None
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.mysqlcxn is not None and self.mysqlcxn.is_connected():
            self.mysqlcxn.close()

    def etl(self):
        ...
I can then use with ... as and ensure that I am cleaning up properly.
with FooData() as obj:
    obj.etl()
The Aborted connection messages are thus properly eliminated.
Oliver Dain's response set me on the right path, and "Explaining Python's '__enter__' and '__exit__'" was very helpful in understanding the right way to implement my class.

How to completely remove database connection in sqlalchemy? Hitting max_user_limit on PythonAnywhere

I am hosting a Flask application on PythonAnywhere, where I have to make a few queries to the database. While using MySQLdb I am able to close all the connections to the database and don't get any errors. But while using sqlalchemy, somehow the connections to the database do not get closed.
This is my connection manager class, which has a method defined to close the database connection.
class ConnectionManager:
    def __init__(self):
        self.base = declarative_base()

    def get_db_session(self):
        self.engine = create_engine(get_db_path())
        self.base.metadata.bind = self.engine
        self.session_maker = sessionmaker(bind=self.engine)
        self.session = self.session_maker()
        return self.session

    def persist_in_db(self, record):
        session = self.get_db_session()
        session.add(record)
        session.commit()
        session.close()

    def close_session(self):
        self.session.close()
        self.session_maker.close_all()
        del self.session
        del self.session_maker
        del self.engine
        #self.engine.dispose()
Before returning the response from the app, I call the close_session method.
So my question basically is: where am I conceptually going wrong, and how can I completely remove the database connection?
This is caused by connection pooling. You can disable connection pooling by using NullPool:
from sqlalchemy.pool import NullPool
self.engine = create_engine(get_db_path(), poolclass=NullPool)
Be careful though: this may not be a good idea in a web app if each web request needs a DB connection.
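An alternative sketch that keeps pooling but stops connections from piling up (assumptions: SQLAlchemy ORM, plus PythonAnywhere's documented advice to recycle pooled connections before its roughly five-minute idle timeout): create the engine and sessionmaker once at module level instead of inside get_db_session, and only open and close sessions per request:

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# One engine for the whole app; it owns the connection pool.
engine = create_engine(get_db_path(), pool_recycle=280)
SessionMaker = sessionmaker(bind=engine)

def get_db_session():
    # Sessions are cheap to create; the pooled connections
    # underneath are reused rather than re-opened each time.
    return SessionMaker()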
