MySQL Connector/Python not closing connection explicitly - python

I have the following:
class FooData(object):

    def __init__(self):
        ...
        try:
            self.my_cnf = os.environ['HOME'] + '/.my.cnf'
            self.my_cxn = mysql.connector.connect(option_files=self.my_cnf)
            self.cursor = self.my_cxn.cursor(dictionary=True)
        except mysql.connector.Error as err:
            if err.errno == 2003:
                self.my_cnf = None
                self.my_cxn = None
                self.cursor = None
I am able to use my_cxn and cursor without any obvious failures. I never explicitly terminate the connection, though, and I have observed the following messages in my MySQL error log...
2017-01-08T15:16:09.355190Z 132 [Note] Aborted connection 132 to db:
'mydatabase' user: 'myusername' host: 'localhost'
(Got an error reading communication packets)
Am I going about this the wrong way? Would it be more efficient to initialize the connector and cursor every time I need to run a query?
What do I need to look for in the MySQL config to avoid these aborted connections?
Separately, I also observe these messages in my error logs frequently:
2017-01-06T15:28:45.203067Z 0 [Warning] Changed limits: max_open_files: 1024
(requested 5000)
2017-01-06T15:28:45.205191Z 0 [Warning] Changed limits: table_open_cache: 431
(requested 2000)
Are these related to the above? What do they mean and how can I resolve them?
I tried various solutions involving /lib/systemd/system/mysql.service.d/limits.conf and other configuration settings but couldn't get any of them to work.

It's not a config issue. When you are done with a connection you should close it by explicitly calling close(). It is generally best practice to keep a connection open for a long time, as creating one takes time. It's not possible to tell from your code snippet where the best place to close it would be - it's whenever you're "done" with it; perhaps at the end of your __main__ method. Similarly, you should close the cursor explicitly when you're done with it. Typically that happens after each query.
So, maybe something like:
class FooData(object):

    def __init__(self):
        ...
        try:
            self.my_cnf = os.environ['HOME'] + '/.my.cnf'
            self.my_cxn = mysql.connector.connect(option_files=self.my_cnf)
        except mysql.connector.Error as err:
            if err.errno == 2003:
                self.my_cnf = None
                self.my_cxn = None

    def execute_some_query(self, query_info):
        """Runs a single query. Thus it creates a cursor to run the
        query and closes it when it's done."""
        # Note that the cursor is not a member variable as it exists
        # only for the life of this one query
        cursor = self.my_cxn.cursor(dictionary=True)
        cursor.execute(...)
        # All done, close the cursor
        cursor.close()

    def close(self):
        """Users of this class should **always** call close when they are
        done with this class so it can clean up the DB connection."""
        self.my_cxn.close()
You might also look into the Python with statement for a nice way to ensure everything is always cleaned up.
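For instance, a minimal sketch using contextlib.closing (the option file path is a placeholder); both the connection and the cursor expose close(), which is all closing() needs:
from contextlib import closing

import mysql.connector

# closing() turns any object with a close() method into a context
# manager, so both objects are cleaned up even if the query raises.
with closing(mysql.connector.connect(option_files='/home/me/.my.cnf')) as cxn:
    with closing(cxn.cursor(dictionary=True)) as cursor:
        cursor.execute("SELECT 1")
        print(cursor.fetchall())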

I rewrote my class above to look like this...
class FooData(object):

    def __init__(self):
        self.myconfig = {
            'option_files': os.environ['HOME'] + '/.my.cnf',
            'database': 'nsdata'
        }
        self.mysqlcxn = None

    def __enter__(self):
        try:
            self.mysqlcxn = mysql.connector.connect(**self.myconfig)
        except mysql.connector.Error as err:
            if err.errno == 2003:
                self.mysqlcxn = None
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.mysqlcxn is not None and self.mysqlcxn.is_connected():
            self.mysqlcxn.close()

    def etl(self):
        ...
I can then use with ... as and ensure that I am cleaning up properly.
with FooData() as obj:
    obj.etl()
The Aborted connection messages are thus properly eliminated.
Oliver Dain's response set me on the right path, and the article "Explaining Python's '__enter__' and '__exit__'" was very helpful in understanding the right way to implement my class.

Related

Python SQLalchemy - can I pass the connection object between functions?

I have a Python application that reads from MySQL/MariaDB, uses that data to fetch more data from an API, and then inserts the results into another table.
I had set up a module with a function that connects to the database and returns the connection object, which is then passed to other functions/modules. However, I believe this might not be the correct approach. The idea was to have a small module that I could just call whenever I needed to connect to the db.
Also note that I am using the same connection object inside loops (and, within the loop, passing it to the db_update module), and I call close() when all is done.
I am also getting some warnings from the db sometimes; those mostly happen at the point where I call db_conn.close(), so I guess I am not handling the connection or session/engine correctly. Also, the connection ids in the log warnings keep increasing, so that is another hint that I am doing it wrong.
[Warning] Aborted connection 351 to db: 'some_db' user: 'some_user' host: '172.28.0.3' (Got an error reading communication packets)
Here is some pseudo code that represents the structure I currently have:
################
## db_connect.py
################
# imports ...
from sqlalchemy import create_engine

def db_connect():
    # get env ...
    db_string = f"mysql+pymysql://{db_user}:{db_pass}@{db_host}:{db_port}/{db_name}"
    try:
        engine = create_engine(db_string)
    except Exception as e:
        return None
    db_conn = engine.connect()
    return db_conn

################
## db_update.py
################
# imports ...
def db_insert(db_conn, api_result):
    # ...
    ins_qry = "INSERT INTO target_table (attr_a, attr_b) VALUES (:a, :b);"
    ins_qry = text(ins_qry)
    ins_qry = ins_qry.bindparams(a=value_a, b=value_b)
    try:
        db_conn.execute(ins_qry)
    except Exception as e:
        print(e)
        return None
    return True

################
## main.py
################
from sqlalchemy import text

from db_connect import db_connect
from db_update import db_insert

def run():
    try:
        db_conn = db_connect()
        if not db_conn:
            return False
    except Exception as e:
        print(e)
    qry = """SELECT *
             FROM some_table
             WHERE some_attr IN (:some_value);"""
    qry = text(qry)
    search_run_qry = qry.bindparams(
        some_value='abc'
    )
    result_list = db_conn.execute(search_run_qry).fetchall()
    for result_item in result_list:
        ## do stuff like fetching data from api for every record in the query result
        api_result = get_api_data(...)
        ## insert into db:
        db_ins_status = db_insert(db_conn, api_result)
        ## ...
    db_conn.close()

run()
EDIT: Another question:
a) Is it OK, in a loop that does an update on every iteration, to use the same connection, or would it be wiser to instead pass the engine to the run() function and call db_conn = engine.connect() and db_conn.close() just before and after each update?
b) I am thinking about using ThreadPoolExecutor instead of the loop for the API calls. Would this have implications for how to use the connection, i.e. can I use the same connection for multiple threads that are doing updates to the same table?
Note: I am not using the ORM feature, mostly because I have a strong DWH/SQL background (though not so much as a DBA) and I am used to writing even complex SQL queries by hand. I am thinking about switching to just using the PyMySQL connector for that reason.
Thanks in advance!
Yes, you can return/pass the connection object as a parameter, but what is the purpose of the db_connect method other than testing the connection? As far as I can see it serves no other purpose, so I would recommend doing it the way I have done it before.
I would like to share a code snippet from one of my projects.
def create_record(sql_query: str, data: tuple):
    try:
        connection = mysql_obj.connect()
        db_cursor = connection.cursor()
        db_cursor.execute(sql_query, data)
        connection.commit()
        return db_cursor, connection
    except Exception as error:
        print(f'Connection failed error message: {error}')
and then calling a similar function wherever I need it:
db_cursor, connection, query_data = fetch_data(sql_query, query_data)
and once everything is done, closing the connection with this method and method call:
def close_connection(connection, db_cursor):
    """
    This method is used to close the SQL server connection
    """
    db_cursor.close()
    connection.close()
and the matching call:
close_connection(connection, db_cursor)
I am not sure whether I can share my GitHub here, but please check this link. In model.py you can see the database methods, and to see how they are called, check main.py.
Best,
Hasan.
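Regarding the questions in the edit: as far as I know, a SQLAlchemy Engine (and its pool) is safe to share across threads, while an individual Connection is not, so the usual pattern is to create the engine once, pass it around, and have each unit of work (loop iteration or thread) check out its own connection. A minimal sketch under that assumption; the URL, table, and values are placeholders:
from concurrent.futures import ThreadPoolExecutor

from sqlalchemy import create_engine, text

# One engine (and its connection pool) for the whole application.
engine = create_engine("mysql+pymysql://user:password@localhost:3306/some_db")

def insert_result(a, b):
    # engine.begin() checks a connection out of the pool, wraps the work
    # in a transaction, and returns the connection to the pool afterwards.
    with engine.begin() as conn:
        conn.execute(
            text("INSERT INTO target_table (attr_a, attr_b) VALUES (:a, :b)"),
            {"a": a, "b": b},
        )

# Each worker checks out its own connection; the engine itself is shared.
with ThreadPoolExecutor(max_workers=4) as pool:
    for a, b in [("x", 1), ("y", 2)]:
        pool.submit(insert_result, a, b)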

PyMySQL is getting slower with each database query

I don't know what I'm doing wrong when connecting to the MySQL database...
class DaoImportantInfo:

    def __init__(self, db=None):
        self.db = db
        self.config = ConfigConnection()
        self.tabelas = TabelasConfig()

    def getAll(self):
        self.db.connect()
        cursor = self.db.cursor()
        strComando = f"""SELECT tetosPrevId, dataValidade, valor FROM {self.tabelas.tblTetosPrev} ORDER BY dataValidade DESC"""
        try:
            cursor.execute(strComando)
            logPrioridade(f'SELECT<getAllTetos>____________{self.tabelas.tblTetosPrev};', TipoEdicao.select, Prioridade.saidaComun)
            return cursor.fetchall()
        except:
            raise Warning(f'Erro SQL - getAllTetos({self.config.banco}) <INSERT {self.tabelas.tblTetosPrev}>')
        finally:
            self.disconectBD(cursor)

    def disconectBD(self, cursor):
        cursor.close()
        self.db.close()
I have a DaoImportantInfo class, which holds the connection and some configuration properties, a getAll method that returns all the rows in the tblTetosPrev table, and a disconectBD method that should close the connection.
The weird thing is that every time I call getAll it takes longer to fetch the data. The first time it takes less than one second; the second time 2 seconds, then 19 seconds...
I read that it is important to close the connection after executing a query, so I don't know what the problem is. Can anybody help me?
I'm using Python 3.8 and Linux Pop!_OS.

Django with Peewee Connection Pooling MySQL disconnect

I'm running a Django project with Peewee in Python 3.6 and trying to track down what's wrong with the connection pooling. I keep getting the following error on the development server (for some reason I never experience this issue on my local machine):
Lost connection to MySQL server during query
The repro steps are reliable and are:
Restart Apache on the instance.
Go to my Django page and press a button which triggers a DB operation.
Works fine.
Wait exactly 10 minutes (I've tested enough to get the exact number).
Press another button to trigger another DB operation.
Get the lost connection error above.
The code is structured such that I have all the DB operations inside an independent Python module which is imported into the Django module.
In the main class constructor I'm setting up the DB as such:
from playhouse.pool import PooledMySQLDatabase

def __init__(self, host, database, user, password, stale_timeout=300):
    self.mysql_db = PooledMySQLDatabase(host=host, database=database, user=user, password=password, stale_timeout=stale_timeout)
    db_proxy.initialize(self.mysql_db)
Every call which needs to reach out to the DB is done like this:
def get_user_by_id(self, user_id):
    db_proxy.connect(reuse_if_open=True)
    user = (User.get(User.user_id == user_id))
    db_proxy.close()
    return {'id': user.user_id, 'first_name': user.first_name, 'last_name': user.last_name, 'email': user.email}
I looked at the wait_timeout value on the MySQL instance and its value is 3600 so that doesn't seem to be the issue (and I tried changing it anyway just to see).
Any ideas on what I could be doing wrong here?
Update:
I found that the /etc/my.cnf configuration file for MySQL has the wait-timeout value set to 600, which matches what I'm experiencing. I don't know why this value doesn't show when I run SHOW VARIABLES LIKE 'wait_timeout'; on the MySQL DB (that returns 3600), but it does seem likely the issue is coming from the wait timeout.
Given this I tried setting the stale timeout to 60, assuming that if it's less than the wait timeout it might fix the issue but it didn't make a difference.
You need to be sure you're recycling the connections properly -- that means that when a request begins you open a connection, and when the response is delivered you close the connection. The pool is most likely not recycling the connection because you're never putting it back in the pool, so it looks like it's still "in use". This can easily be done with middleware, as described here:
http://docs.peewee-orm.com/en/latest/peewee/database.html#django
I finally came up with a fix which works for my case, after trying numerous ideas. It's not ideal but it works. This post on Connection pooling pointed me in the right direction.
I created a Django middleware class and configured it to be the first in the list of Django middleware.
from peewee import OperationalError
from playhouse.pool import PooledMySQLDatabase

database = PooledMySQLDatabase(None)

class PeeweeConnectionMiddleware(object):
    CONN_FAILURE_CODES = [2006, 2013]

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if database.database:  # Is DB initialized?
            response = None
            try:
                database.connect(reuse_if_open=True)
                with database.atomic() as transaction:
                    try:
                        response = self.get_response(request)
                    except:
                        transaction.rollback()
                        raise
            except OperationalError as exception:
                if exception.args[0] in self.CONN_FAILURE_CODES:
                    database.close_all()
                    database.connect()
                    response = None
                    with database.atomic() as transaction:
                        try:
                            response = self.get_response(request)
                        except:
                            transaction.rollback()
                            raise
                else:
                    raise
            finally:
                if not database.is_closed():
                    database.close()
            return response
        else:
            return self.get_response(request)
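For completeness, a middleware class like this also has to be registered in the Django settings; putting it first means it wraps every other middleware and the view. A minimal sketch, assuming the class lives in a hypothetical myapp/middleware.py:
# settings.py -- the module path below is an assumption for illustration
MIDDLEWARE = [
    'myapp.middleware.PeeweeConnectionMiddleware',
    # ... the rest of the middleware stack ...
]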

Correct way of keeping mysql connection [duplicate]

This question already has answers here:
How to enable MySQL client auto re-connect with MySQLdb?
(8 answers)
Closed 4 years ago.
I have an application running 24/7 that uses MySQL from several different functions. One way to implement it is to create a global MySQL connection in the application like this:
self.db = MySQLdb.connect(
    host=self.settings.MYSQL_HOST_LOCAL,
    user=self.settings.MYSQL_USER,
    passwd=self.settings.MYSQL_PASS,
    db=self.settings.MYSQL_DB,
    use_unicode=True,
    charset="utf8",
)
and execute commands using self.db.execute(...). By doing this, the application uses one connection. The other way is to create a connection every time I need to execute a transaction.
Approach 1 prevents the application from creating and deleting connections over and over, but it will run into the "MySQL server has gone away" problem if the connection stays idle. Approach 2 doesn't have the "gone away" problem, but it costs a lot of I/O.
I am pretty sure neither of these approaches is the right one, but what is the right approach?
One way to do it is to create a connection every time you need to execute a query. You can also create a function to do it for you. This is how it is meant to be, and it is not too much I/O.
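For instance, a minimal sketch of such a helper, assuming MySQLdb and placeholder connection parameters:
from contextlib import contextmanager

import MySQLdb

@contextmanager
def get_db():
    # Open a fresh connection per unit of work and always close it.
    db = MySQLdb.connect(host="localhost", user="user",
                         passwd="password", db="mydb")
    try:
        yield db
        db.commit()
    except Exception:
        db.rollback()
        raise
    finally:
        db.close()

# Usage:
with get_db() as db:
    cur = db.cursor()
    cur.execute("SELECT 1")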
You can also do the following:
while True:
    try:
        db_session.execute('SELECT * FROM table')
        break
    except SQLAlchemyError:
        db_session.rollback()
If the connection has gone away, this will raise an exception; the session will be rolled back, and the retry is likely to succeed. (The first solution is still much better.)
In Approach 1, 'MySQL server has gone away' may happen because you have encountered a timeout on the server side while automatic reconnection in the client is disabled (the reconnect flag in the MYSQL structure is equal to 0).
You can add a decorator like this to any db handler function, to reconnect to MySQL when the 'MySQL server has gone away' exception is raised:
import threading

import MySQLdb

lock = threading.Lock()  # guards the reconnect across threads

class DB:
    """Database interface"""

    def retry(func):
        def call(self, *args, **kwargs):
            lock.acquire()
            try:
                return func(self, *args, **kwargs)
            except MySQLdb.Error as e:
                if 'MySQL server has gone away' in str(e):
                    # reconnect MySQL
                    self.connect_mysql()
                else:
                    # No need to retry for other reasons
                    pass
            finally:
                lock.release()
        return call

    def __init__(self):
        pass

    def connect_mysql(self):
        # do connection
        ...

    @retry
    def db_handler_function(self):
        # do something
        ...

Python and Django OperationalError (2006, 'MySQL server has gone away')

Original: I have recently started getting MySQL OperationalErrors from some of my old code and cannot seem to trace back the problem. Since it was working before, I thought it may have been a software update that broke something. I am using Python 2.7 with Django runfcgi and nginx. Here is my original code:
views.py
DBNAME = "test"
DBIP = "localhost"
DBUSER = "django"
DBPASS = "password"
db = MySQLdb.connect(DBIP,DBUSER,DBPASS,DBNAME)
cursor = db.cursor()
def list(request):
statement = "SELECT item from table where selected = 1"
cursor.execute(statement)
results = cursor.fetchall()
I have tried the following, but it still does not work:
views.py
class DB:
    conn = None
    DBNAME = "test"
    DBIP = "localhost"
    DBUSER = "django"
    DBPASS = "password"

    def connect(self):
        self.conn = MySQLdb.connect(self.DBIP, self.DBUSER, self.DBPASS, self.DBNAME)

    def cursor(self):
        try:
            return self.conn.cursor()
        except (AttributeError, MySQLdb.OperationalError):
            self.connect()
            return self.conn.cursor()

db = DB()
cursor = db.cursor()

def list(request):
    cursor = db.cursor()
    statement = "SELECT item from table where selected = 1"
    cursor.execute(statement)
    results = cursor.fetchall()
Currently, my only workaround is to do MySQLdb.connect() in each function that uses mysql. Also I noticed that when using django's manage.py runserver, I would not have this problem while nginx would throw these errors. I doubt that I am timing out with the connection because list() is being called within seconds of starting the server up. Were there any updates to the software I am using that would cause this to break/is there any fix for this?
Edit: I realized that I recently wrote a piece of middleware to daemonize a function, and this was the cause of the problem. However, I cannot figure out why. Here is the code for the middleware:
def process_request_handler(sender, **kwargs):
    t = threading.Thread(target=dispatch.execute,
                         args=[kwargs['nodes'], kwargs['callback']],
                         kwargs={})
    t.setDaemon(True)
    t.start()
    return

process_request.connect(process_request_handler)
Sometimes if you see "OperationalError: (2006, 'MySQL server has gone away')", it is because you are issuing a query that is too large. This can happen, for instance, if you're storing your sessions in MySQL, and you're trying to put something really big in the session. To fix the problem, you need to increase the value of the max_allowed_packet setting in MySQL.
The default value is 1048576.
To see the current value for the default, run the following SQL:
SELECT @@max_allowed_packet;
To temporarily set a new value, run the following SQL:
set global max_allowed_packet=10485760;
To fix the problem more permanently, create a /etc/my.cnf file with at least the following:
[mysqld]
max_allowed_packet = 16M
After editing /etc/my.cnf, you'll need to restart MySQL or restart your machine if you don't know how.
As per the MySQL documentation, your error message is raised when the client can't send a question to the server, most likely because the server itself has closed the connection. In the most common case the server will close an idle connection after a (default) of 8 hours. This is configurable on the server side.
The MySQL documentation gives a number of other possible causes which might be worth looking into to see if they fit your situation.
An alternative to calling connect() in every function (which might end up needlessly creating new connections) would be to investigate using the ping() method on the connection object; this tests the connection with the option of attempting an automatic reconnect. I struggled to find some decent documentation for the ping() method online, but the answer to this question might help.
Note, automatically reconnecting can be dangerous when handling transactions as it appears the reconnect causes an implicit rollback (and appears to be the main reason why autoreconnect is not a feature of the MySQLdb implementation).
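For instance, a minimal sketch of the ping() approach with MySQLdb (connection parameters are placeholders; the True argument asks the client to reconnect if the connection has dropped):
import MySQLdb

db = MySQLdb.connect(host="localhost", user="user",
                     passwd="password", db="mydb")

def get_cursor():
    # ping(True) checks the connection and transparently reconnects
    # if it has gone away; beware the implicit rollback mentioned above.
    db.ping(True)
    return db.cursor()

cursor = get_cursor()
cursor.execute("SELECT 1")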
This might be due to DB connections getting copied into your child processes from the main process. I faced the same error when using Python's multiprocessing library to spawn different processes. The connection objects are copied between processes during forking, which leads to MySQL OperationalErrors when making DB calls in the child process.
Here's a good reference to solve this: Django multiprocessing and database connections
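One common fix along those lines is to close Django's connections in the parent before forking, so each child opens a fresh one on its first query; a minimal sketch (Python 3, do_work is a placeholder):
from multiprocessing import Pool

from django.db import connections

def do_work(item):
    # Each worker opens its own fresh connection lazily on first query.
    ...

# Close all connections in the parent so children don't inherit
# (and later corrupt) the parent's sockets.
connections.close_all()

with Pool(processes=4) as pool:
    pool.map(do_work, range(10))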
For me this was happening in debug mode.
So I tried persistent connections in debug mode; check out the link: Django - Documentation - Databases - Persistent connections.
In settings:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'dbname',
        'USER': 'root',
        'PASSWORD': 'root',
        'HOST': 'localhost',
        'PORT': '3306',
        'CONN_MAX_AGE': None
    },
}
Check if you are allowed to create a mysql connection object in one thread and then use it in another.
If it's forbidden, use threading.local for per-thread connections:
class Db(threading.local):
    """ thread-local db object """
    con = None

    def __init__(self, ...options...):
        super(Db, self).__init__()
        self.con = MySQLdb.connect(...options...)

db1 = Db(...)

def test():
    """safe to run from any thread"""
    cursor = db1.con.cursor()
    cursor.execute(...)
This error is mysterious because MySQL doesn't report why it disconnects; it just goes away.
It seems there are many causes of this kind of disconnection. One I just found is that if the query string is too large, the server will disconnect. This probably relates to the max_allowed_packet setting.
I've been struggling with this issue too. I don't like the idea of increasing the timeout on the MySQL server. Autoreconnect with CONN_MAX_AGE doesn't work either, as was mentioned. Unfortunately I ended up wrapping every method that queries the database like this:
from django.db import connection
from django.db.utils import InterfaceError, OperationalError

def do_db(callback, *args, **kwargs):
    try:
        return callback(*args, **kwargs)
    except (OperationalError, InterfaceError) as e:
        # Connection has gone away; filter by message or error code
        # if you need to catch other errors separately
        connection.close()
        return callback(*args, **kwargs)

do_db(User.objects.get, id=123)  # instead of User.objects.get(id=123)
As you can see, I prefer catching the exception over pinging the database before every query, because catching an exception is the rare case. I would expect Django to reconnect automatically, but they seem to have declined to implement that.
This error may occur when you try to use the connection after a time-consuming operation that doesn't go to the database. Since the connection is not used for some time, the MySQL timeout is hit and the connection is silently dropped.
You can try calling close_old_connections() after the time-consuming non-DB operation so that a new connection is opened if the existing one has become unusable. Beware: do not use close_old_connections() if you have a transaction open.
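A minimal sketch of that pattern; slow_crunch and the Item model are placeholders for illustration:
from django.db import close_old_connections

from myapp.models import Item  # hypothetical model

result = slow_crunch()  # long CPU/network work, no DB access

# Drop any connection that has exceeded its max age or gone stale,
# so the query below opens a fresh one.
close_old_connections()

Item.objects.create(payload=result)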
The most common cause of such a warning is that your application has reached the wait_timeout value of MySQL.
I had the same problem with a Flask app. Here's how I solved it:
$ grep timeout /etc/mysql/mysql.conf.d/mysqld.cnf
# https://support.rackspace.com/how-to/how-to-change-the-mysql-timeout-on-a-server/
# wait = timeout for application session (tdm)
# interactive = timeout for keyboard session (terminal)
# 7 days = 604800s / 4 hours = 14400s
wait_timeout = 604800
interactive_timeout = 14400
Observation: if you query these variables via MySQL batch mode, the values appear as configured. But if you run SHOW VARIABLES LIKE 'wait%'; or SHOW VARIABLES LIKE 'interactive%'; interactively, the value configured for interactive_timeout is shown for both variables. I don't know why, but the fact is that the values configured for each variable in /etc/mysql/mysql.conf.d/mysqld.cnf are the ones MySQL respects.
How old is this code? Django has had databases defined in settings since at least 0.96. The only other thing I can think of is multi-db support, which changed things a bit, but even that was in 1.1 or 1.2.
Even if you need a special DB for certain views, I think you'd probably be better off defining it in settings.
SQLAlchemy now has a great write-up on how you can use pinging to be pessimistic about your connection's freshness:
http://docs.sqlalchemy.org/en/latest/core/pooling.html#disconnect-handling-pessimistic
From there,
from sqlalchemy import exc
from sqlalchemy import event
from sqlalchemy.pool import Pool

@event.listens_for(Pool, "checkout")
def ping_connection(dbapi_connection, connection_record, connection_proxy):
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SELECT 1")
    except:
        # optional - dispose the whole pool
        # instead of invalidating one at a time
        # connection_proxy._pool.dispose()

        # raise DisconnectionError - pool will try
        # connecting again up to three times before raising.
        raise exc.DisconnectionError()
    cursor.close()
And a test to make sure the above works:
from sqlalchemy import create_engine

e = create_engine("mysql://scott:tiger@localhost/test", echo_pool=True)
c1 = e.connect()
c2 = e.connect()
c3 = e.connect()
c1.close()
c2.close()
c3.close()
# pool size is now three.

print "Restart the server"
raw_input()

for i in xrange(10):
    c = e.connect()
    print c.execute("select 1").fetchall()
    c.close()
I had this problem and did not have the option to change my configuration. I finally figured out that the problem was occurring 49,500 records into my 50,000-record loop, because that was about the time I was trying again (after a long gap) to hit my second database.
So I changed my code so that every few thousand records I touched the second database again (with a count() of a very small table), and that fixed it. No doubt "ping" or some other means of touching the database would work as well.
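A minimal sketch of that keepalive trick, assuming Django-style ORM access; Record, Tiny, process, and the 'second_db' alias are hypothetical names:
KEEPALIVE_EVERY = 2000  # tune to stay well under wait_timeout

for i, record in enumerate(Record.objects.iterator()):
    process(record)  # placeholder for the slow per-record work

    # Touch the second database periodically so its connection never
    # sits idle long enough to be dropped by the server.
    if i % KEEPALIVE_EVERY == 0:
        Tiny.objects.using('second_db').count()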
First, you should check the MySQL session and global wait_timeout and interactive_timeout values. Second, your client should try to reconnect whenever a connection has been idle longer than those values.
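A minimal sketch of both steps with mysql.connector (connection parameters are placeholders):
import mysql.connector

cnx = mysql.connector.connect(user='user', password='password',
                              host='localhost', database='mydb')

# Step 1: inspect the server-side timeouts.
cur = cnx.cursor()
cur.execute("SHOW SESSION VARIABLES LIKE 'wait_timeout'")
print(cur.fetchone())
cur.execute("SHOW GLOBAL VARIABLES LIKE 'wait_timeout'")
print(cur.fetchone())
cur.close()

# Step 2: before reusing a connection that may have idled past the
# timeout, ping it and let the client reconnect if it was dropped.
cnx.ping(reconnect=True, attempts=3, delay=1)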
