SQLAlchemy connection hangs on AWS MySQL RDS reboot with failover - python

We have a Python server which uses SQLAlchemy to read/write data from an AWS MySQL MultiAZ RDS instance.
We're seeing behavior we'd like to avoid: whenever we trigger a reboot with failover, a connection that was already open hangs indefinitely on its next statement. While the AWS documentation says this is to be expected, we would expect the Python MySQL connector to be able to cope with this situation.
The closest case we've found on the web is this Google Groups thread, which discusses the issue and offers a solution for a Postgres RDS.
For example, the script below hangs indefinitely when a failover reboot is initiated (adapted from the above-mentioned Google Groups thread).
from datetime import datetime
from time import time, sleep
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.orm.scoping import scoped_session
from sqlalchemy.ext.declarative import declarative_base
import logging

current_milli_time = lambda: int(round(time() * 1000))
Base = declarative_base()
logging.basicConfig(format='%(asctime)s %(filename)s %(lineno)s %(process)d %(levelname)s: %(message)s', level="INFO")

class Message(Base):
    __tablename__ = 'message'
    id = Column(Integer, primary_key=True)
    body = Column(String(450), nullable=False)

engine = create_engine('mysql://<username>:<password>@<db_host>/<db_name>', echo=False, pool_recycle=1800,)
session_maker = scoped_session(sessionmaker(bind=engine, autocommit=False, autoflush=False))
session = session_maker()

while True:
    try:
        # Read the five newest rows and time the query.
        ids = ''
        start = current_milli_time()
        for msg in session.query(Message).order_by(Message.id.desc()).limit(5):
            ids += str(msg.id) + ', '
        logging.info('({!s}) (took {!s} ms) fetched ids: {!s}'.format(datetime.now().time().isoformat(), current_milli_time() - start, ids))
        # Insert one row and time the commit.
        start = current_milli_time()
        m = Message()
        m.body = 'some text'
        session.add(m)
        session.commit()
        logging.info('({!s}) (took {!s} ms) inserted new message'.format(datetime.now().time().isoformat(), current_milli_time() - start))
    except Exception as e:
        logging.exception(e)
        session.rollback()
    finally:
        session_maker.remove()
        sleep(0.25)
We've tried playing with the connection timeouts but it seems the issue is related to an already opened connection which simply hangs once AWS switches to the failover instance.
Our question is - has anyone encountered this issue or has possible directions worthwhile checking?

IMHO, relying on SQL connector timeouts to handle a switchover is like black magic. Each connector behaves differently, and the behavior is difficult to diagnose.
If you read @univerio's comment again, AWS assigns a new IP address to the SAME RDS endpoint name. During the switch, the RDS endpoint name and the old IP address are still in your server instance's DNS cache. So this is a DNS caching issue, and that's why AWS asks you to "clean up....".
Unless you restart SQLAlchemy so that it resolves the DNS name again, the session has no way of knowing that something happened and cannot switch dynamically. Worse, the issue may lie in the connector that SQLAlchemy uses.
IMHO, it isn't worth the effort to deal with the switchover inside the code. I would just subscribe to an AWS service such as Lambda that can act on switchover events and trigger the app server to restart the connection, which should then pick up the new IP address.
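To make the DNS point concrete, here is a minimal sketch of how a process could notice that the endpoint name now resolves to a different address. It uses only the standard library. The helper names resolve_ips and endpoint_moved are hypothetical, and the resolver parameter exists only so the lookup can be swapped out in tests. Whether the lookup sees a fresh answer still depends on the OS and resolver caches.

```python
import socket


def resolve_ips(host, port=3306, resolver=socket.getaddrinfo):
    """Return the set of IP addresses the endpoint name currently resolves to."""
    infos = resolver(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def endpoint_moved(known_ips, host, port=3306, resolver=socket.getaddrinfo):
    """True when the name resolves to at least one address not seen before."""
    current = resolve_ips(host, port, resolver)
    return not current.issubset(known_ips)
```

If endpoint_moved(...) returns True, one option is to call engine.dispose() on the SQLAlchemy engine so that the pool opens fresh connections. That is an assumption about how you would wire it in, not something the answer above prescribes.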

Related

How to fix problem "Unable to complete the operation against any hosts" in Cassandra?

I have a pretty simple AWS Lambda function in which I connect to an Amazon Keyspaces for Cassandra database. This code in Python works, but from time to time I get the error. How do I fix this strange behavior? I have an assumption that you need to make additional settings when initializing the cluster. For example, set_max_connections_per_host. I would appreciate any help.
ERROR:
('Unable to complete the operation against any hosts', {<Host: X.XXX.XX.XXX:XXXX eu-central-1>: ConnectionShutdown('Connection to X.XXX.XX.XXX:XXXX was closed')})
lambda_function.py:
import sessions

cassandra_db_session = None
cassandra_db_username = 'your-username'
cassandra_db_password = 'your-password'
cassandra_db_endpoints = ['your-endpoint']
cassandra_db_port = 9142

def lambda_handler(event, context):
    global cassandra_db_session
    if not cassandra_db_session:
        cassandra_db_session = sessions.create_cassandra_session(
            cassandra_db_username,
            cassandra_db_password,
            cassandra_db_endpoints,
            cassandra_db_port
        )
    result = cassandra_db_session.execute('select * from "your-keyspace"."your-table";')
    return 'ok'
sessions.py:
from ssl import SSLContext
from ssl import CERT_REQUIRED
from ssl import PROTOCOL_TLSv1_2
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra.policies import DCAwareRoundRobinPolicy

def create_cassandra_session(db_username, db_password, db_endpoints, db_port):
    ssl_context = SSLContext(PROTOCOL_TLSv1_2)
    ssl_context.load_verify_locations('your-path/AmazonRootCA1.pem')
    ssl_context.verify_mode = CERT_REQUIRED
    auth_provider = PlainTextAuthProvider(username=db_username, password=db_password)
    cluster = Cluster(
        db_endpoints,
        ssl_context=ssl_context,
        auth_provider=auth_provider,
        port=db_port,
        load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='eu-central-1'),
        protocol_version=4,
        connect_timeout=60
    )
    session = cluster.connect()
    return session
There isn't much point setting the max connections on the client side since AWS Lambdas are effectively "dead" between runs. For the same reason, the recommendation is to disable driver heartbeats (with idle_heartbeat_interval = 0) since there is no activity that occurs until the next time the function is called.
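As a rough illustration of the heartbeat recommendation, the sketch below builds the keyword arguments for cassandra.cluster.Cluster with idle_heartbeat_interval set to 0. The helper names lambda_cluster_options and create_cluster are hypothetical. The driver import is done lazily, so the options helper can be inspected without the driver installed.

```python
def lambda_cluster_options(endpoints, port=9142):
    """Keyword arguments for cassandra.cluster.Cluster in a short-lived Lambda."""
    return {
        "contact_points": list(endpoints),
        "port": port,
        # 0 turns off the driver's idle heartbeat; nothing runs between
        # invocations, so a heartbeat could not fire reliably anyway.
        "idle_heartbeat_interval": 0,
    }


def create_cluster(endpoints, port=9142, **extra):
    """Build a Cluster; pass ssl_context=..., auth_provider=... via extra."""
    # Imported here so that the options helper works without the driver.
    from cassandra.cluster import Cluster

    options = lambda_cluster_options(endpoints, port)
    options.update(extra)
    return Cluster(**options)
```

In the question's sessions.py this would amount to adding idle_heartbeat_interval=0 to the existing Cluster(...) call.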
This doesn't necessarily cause the issue you are seeing but there's a good chance the connection is being reused by the driver after it has been closed server-side.
With the lack of public documentation on the inner-workings of AWS Keyspaces, it's difficult to know what is happening on the cluster. I've always suspected that AWS Keyspaces has a CQL-like API engine in front of a Dynamo DB so there are quirks like what you're seeing that are hard to track down since it requires knowledge only available internally at AWS.
FWIW the DataStax drivers aren't tested against AWS Keyspaces.
This is the biggest issue which I see:
result = cassandra_db_session.execute('select * from "your-keyspace"."your-table";')
The code looks fine, but I don't see a WHERE clause. So if there's a lot of data, a single node (chosen as a coordinator) will have to build the result set while pulling data from all other nodes. As this results in (un)predictably bad performance, that could explain why it works sometimes, but not others.
Pro-tip: All queries in Cassandra should have a WHERE clause.

How to fix Firebird error 'Database already opened with engine instance, incompatible with current'

I have a Flask app that uses flask_sqlalchemy:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
app = Flask(__name__)
app.config.from_pyfile(filename='settings.py', silent=True)
db = SQLAlchemy(app=app)
I want to connect to the same database from a daemon. In the daemon I just import db and use db.engine.execute for SQLAlchemy queries.
But once the daemon starts, the main app can't connect to the database.
In the log I see this:
fdb.fbcore.DatabaseError: ('Error while connecting to database:\n- SQLCODE:
-902\n- I/O error during "lock" operation for file "main.fdb"\n- Database
already opened with engine instance, incompatible with current', -902,
335544344)
I tried setting the isolation level:
from fdb.fbcore import ISOLATION_LEVEL_READ_COMMITED_LEGACY

class TPBAlchemy(SQLAlchemy):
    def apply_driver_hacks(self, app_, info, options):
        if 'isolation_level' not in options:
            options['isolation_level'] = ISOLATION_LEVEL_READ_COMMITED_LEGACY
        return super(TPBAlchemy, self).apply_driver_hacks(app_, info, options)
And replaced this:
db = SQLAlchemy()
With:
db = TPBAlchemy()
But this only produces another error:
TypeError: Invalid argument(s) 'isolation_level' sent to create_engine(),
using configuration FBDialect_fdb/QueuePool/Engine. Please check that the
keyword arguments are appropriate for this combination of components.
I would appreciate the full example to address my issue.
Your connection string is for an embedded database. You're only allowed to have one 'connection' to an embedded database at a time.
If you have the Loopback provider enabled you can change your connection string to something like:
localhost:/var/www/main.fdb
or if you have the Remote provider enabled you will have to access your database from another remote node, and assuming your Firebird server lives on 192.168.1.100 change your connection string to
192.168.1.100:/var/www/main.fdb
If you intend to use the Engine12 provider (the embedded provider), then you have to stop whatever is already connected to that database, because this provider does not allow two simultaneous users.
Also, try to set up some database aliases so you aren't specifying a database explicitly like that. In Firebird 3.0.3 check out databases.conf, where you can do something like:
mydatabasealias=/var/www/main.fdb
and your connection string would now be mydatabasealias instead of the whole path.

SQLAlchemy error MySQL server has gone away

I have already run into the error OperationalError: (OperationalError) (2006, 'MySQL server has gone away') while writing a Flask project, but I can't understand why I get it.
I have code like this (if the code is small and executes fast, there are no errors):
db_engine = create_engine('mysql://root@127.0.0.1/mind?charset=utf8', pool_size=10, pool_recycle=7200)
Base.metadata.create_all(db_engine)
Session = sessionmaker(bind=db_engine, autoflush=True)
Session = scoped_session(Session)
session = Session()
# there many classes and functions
session.close()
This code gives me the error 'MySQL server has gone away', but only after some time, when I use pauses in my script.
I use MySQL from openserver.ru (a web server bundle similar to WAMP).
Thanks.
Looking at the mysql docs, we can see that there are a bunch of reasons why this error can occur. However, the two main reasons I've seen are:
1) The most common reason is that the connection has been dropped because it hasn't been used in more than 8 hours (default setting)
By default, the server closes the connection after eight hours if nothing has happened. You can change the time limit by setting the wait_timeout variable when you start mysqld
I'll just mention for completeness the two ways to deal with that, but they've already been mentioned in other answers:
A: I have a very long running job and so my connection is stale. To fix this, I refresh my connection:
create_engine(conn_str, pool_recycle=3600) # recycle every hour
B: I have a long running service and long periods of inactivity. To fix this I ping mysql before every call:
create_engine(conn_str, pool_pre_ping=True)
2) My packet size is too large, which should throw this error:
_mysql_exceptions.OperationalError: (1153, "Got a packet bigger than 'max_allowed_packet' bytes")
I've only seen this buried in the middle of the trace, though often you'll only see the generic _mysql_exceptions.OperationalError (2006, 'MySQL server has gone away'), so it's hard to catch, especially if logs are in multiple places.
The above doc says the max packet size is 64MB by default, but it's actually 16MB, which can be verified with SELECT @@max_allowed_packet
To fix this, decrease packet size for INSERT or UPDATE calls.
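One way to keep each statement under the packet limit is to split a bulk INSERT or UPDATE into batches with a size budget. The sketch below is an illustration only. batch_by_size is a hypothetical helper, and the default size estimate (the length of repr(row) in UTF-8) is a crude stand-in for the real wire size, so leave generous headroom below max_allowed_packet.

```python
def batch_by_size(rows, max_bytes,
                  size_of=lambda row: len(repr(row).encode("utf-8"))):
    """Yield lists of rows whose estimated total size stays within max_bytes.

    A single row larger than max_bytes is still yielded, on its own.
    """
    batch, batch_size = [], 0
    for row in rows:
        row_size = size_of(row)
        # Close the current batch before it would exceed the budget.
        if batch and batch_size + row_size > max_bytes:
            yield batch
            batch, batch_size = [], 0
        batch.append(row)
        batch_size += row_size
    if batch:
        yield batch
```

Each yielded batch can then be handed to a separate cursor.executemany(...) call or a separate session.execute(...) with a list of parameter dictionaries.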
SQLAlchemy now has a great write-up on how you can use pinging to be pessimistic about your connection's freshness:
http://docs.sqlalchemy.org/en/latest/core/pooling.html#disconnect-handling-pessimistic
From there,
from sqlalchemy import exc
from sqlalchemy import event
from sqlalchemy.pool import Pool

@event.listens_for(Pool, "checkout")
def ping_connection(dbapi_connection, connection_record, connection_proxy):
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SELECT 1")
    except:
        # optional - dispose the whole pool
        # instead of invalidating one at a time
        # connection_proxy._pool.dispose()
        # raise DisconnectionError - pool will try
        # connecting again up to three times before raising.
        raise exc.DisconnectionError()
    cursor.close()
And a test to make sure the above works:
from sqlalchemy import create_engine
e = create_engine("mysql://scott:tiger@localhost/test", echo_pool=True)
c1 = e.connect()
c2 = e.connect()
c3 = e.connect()
c1.close()
c2.close()
c3.close()
# pool size is now three.
print "Restart the server"
raw_input()
for i in xrange(10):
    c = e.connect()
    print c.execute("select 1").fetchall()
    c.close()
From the documentation, you can use the pool_recycle parameter:
from sqlalchemy import create_engine
e = create_engine("mysql://scott:tiger@localhost/test", pool_recycle=3600)
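A small sketch of how the recycle interval could be derived from the server's wait_timeout instead of being hard-coded. pool_options and make_engine are hypothetical helpers, the 0.5 safety margin is an arbitrary choice, and 28800 seconds is the 8-hour default mentioned above. pool_pre_ping needs SQLAlchemy 1.2 or later.

```python
def pool_options(wait_timeout_seconds, safety_margin=0.5):
    """create_engine() keyword arguments that keep pooled connections
    younger than the server's wait_timeout."""
    if wait_timeout_seconds <= 0:
        raise ValueError("wait_timeout_seconds must be positive")
    return {
        # Recycle well before the server would drop the idle connection.
        "pool_recycle": max(1, int(wait_timeout_seconds * safety_margin)),
        # Test each connection on checkout and replace it if it is dead.
        "pool_pre_ping": True,
    }


def make_engine(url, wait_timeout_seconds=28800):
    """Create an engine with the options above (imports SQLAlchemy lazily)."""
    from sqlalchemy import create_engine

    return create_engine(url, **pool_options(wait_timeout_seconds))
```

The actual wait_timeout of a server can be read with SHOW VARIABLES LIKE 'wait_timeout'.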
I just faced the same problem and solved it with some effort. I hope my experience is helpful to others.
Following some suggestions, I used a connection pool and set pool_recycle to less than wait_timeout, but it still didn't work.
Then I realized that a global session may keep using the same connection, so the connection pool had no effect. To avoid a global session, generate a new session for each request and remove it with Session.remove() after processing.
Finally, all is well.
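The per-request pattern described above can be sketched as a context manager. This is a generic illustration, not the poster's code. request_session is a hypothetical name, and session_factory is assumed to be a sessionmaker or any callable returning an object with commit(), rollback() and close().

```python
from contextlib import contextmanager


@contextmanager
def request_session(session_factory):
    """Open a session for one unit of work and always release it afterwards."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        # Returns the connection to the pool. With a scoped_session you
        # would call Session.remove() at this point instead.
        session.close()
```

In a Flask view this would be used as "with request_session(Session) as session: ...", so that no session (and no connection) outlives the request.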
One more point to keep in mind is to manually push the flask application context with database initialization. This should resolve the issue.
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy()
app = Flask(__name__)
with app.app_context():
    db.init_app(app)
https://docs.sqlalchemy.org/en/latest/core/pooling.html#disconnect-handling-optimistic
def sql_read(cls, sql, connection):
    """sql for read action like select
    """
    LOG.debug(sql)
    try:
        result = connection.engine.execute(sql)
        header = result.keys()
        for row in result:
            yield dict(zip(header, row))
    except OperationalError as e:
        LOG.info("recreate pool due to %s" % e)
        # engine.dispose() drops the stale pooled connections and installs a
        # fresh pool; pool.recreate() alone only returns a new, unused pool.
        connection.engine.dispose()
        result = connection.engine.execute(sql)
        header = result.keys()
        for row in result:
            yield dict(zip(header, row))
    except Exception as ee:
        LOG.error(ee)
        raise SqlExecuteError()

Python and Django OperationalError (2006, 'MySQL server has gone away')

Original: I have recently started getting MySQL OperationalErrors from some of my old code and cannot seem to trace back the problem. Since it was working before, I thought it may have been a software update that broke something. I am using python 2.7 with django runfcgi with nginx. Here is my original code:
views.py
import MySQLdb

DBNAME = "test"
DBIP = "localhost"
DBUSER = "django"
DBPASS = "password"
db = MySQLdb.connect(DBIP,DBUSER,DBPASS,DBNAME)
cursor = db.cursor()

def list(request):
    statement = "SELECT item from table where selected = 1"
    cursor.execute(statement)
    results = cursor.fetchall()
I have tried the following, but it still does not work:
views.py
class DB:
    conn = None
    DBNAME = "test"
    DBIP = "localhost"
    DBUSER = "django"
    DBPASS = "password"

    def connect(self):
        self.conn = MySQLdb.connect(self.DBIP, self.DBUSER, self.DBPASS, self.DBNAME)

    def cursor(self):
        try:
            return self.conn.cursor()
        except (AttributeError, MySQLdb.OperationalError):
            # No connection yet, or the old one is unusable: reconnect once.
            self.connect()
            return self.conn.cursor()

db = DB()
cursor = db.cursor()

def list(request):
    cursor = db.cursor()
    statement = "SELECT item from table where selected = 1"
    cursor.execute(statement)
    results = cursor.fetchall()
Currently, my only workaround is to do MySQLdb.connect() in each function that uses mysql. Also I noticed that when using django's manage.py runserver, I would not have this problem while nginx would throw these errors. I doubt that I am timing out with the connection because list() is being called within seconds of starting the server up. Were there any updates to the software I am using that would cause this to break/is there any fix for this?
Edit: I realized that I recently wrote a piece of middle-ware to daemonize a function and this was the cause of the problem. However, I cannot figure out why. Here is the code for the middle-ware
def process_request_handler(sender, **kwargs):
    t = threading.Thread(target=dispatch.execute,
                         args=[kwargs['nodes'],kwargs['callback']],
                         kwargs={})
    t.setDaemon(True)
    t.start()
    return
process_request.connect(process_request_handler)
Sometimes if you see "OperationalError: (2006, 'MySQL server has gone away')", it is because you are issuing a query that is too large. This can happen, for instance, if you're storing your sessions in MySQL, and you're trying to put something really big in the session. To fix the problem, you need to increase the value of the max_allowed_packet setting in MySQL.
The default value is 1048576.
To see the current value, run the following SQL:
select @@max_allowed_packet;
To temporarily set a new value, run the following SQL:
set global max_allowed_packet=10485760;
To fix the problem more permanently, create a /etc/my.cnf file with at least the following:
[mysqld]
max_allowed_packet = 16M
After editing /etc/my.cnf, you'll need to restart MySQL or restart your machine if you don't know how.
As per the MySQL documentation, your error message is raised when the client can't send a question to the server, most likely because the server itself has closed the connection. In the most common case the server will close an idle connection after a (default) of 8 hours. This is configurable on the server side.
The MySQL documentation gives a number of other possible causes which might be worth looking into to see if they fit your situation.
An alternative to calling connect() in every function (which might end up needlessly creating new connections) would be to investigate using the ping() method on the connection object; this tests the connection with the option of attempting an automatic reconnect. I struggled to find some decent documentation for the ping() method online, but the answer to this question might help.
Note, automatically reconnecting can be dangerous when handling transactions as it appears the reconnect causes an implicit rollback (and appears to be the main reason why autoreconnect is not a feature of the MySQLdb implementation).
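A minimal sketch of the ping-then-reconnect idea, written against a generic DB-API connection that exposes ping(), as MySQLdb connections do. ensure_connection is a hypothetical helper. In real code you would probably narrow errors to (MySQLdb.OperationalError, MySQLdb.InterfaceError) and, given the transaction caveat above, call it only at the start of a unit of work.

```python
def ensure_connection(conn, connect, errors=(Exception,)):
    """Return a usable connection: conn if ping() succeeds, else a new one.

    conn    -- an existing connection exposing ping(), or None
    connect -- zero-argument callable that opens a fresh connection
    errors  -- exception types that mean "this connection is dead"
    """
    if conn is not None:
        try:
            conn.ping()
            return conn
        except errors:
            pass  # fall through and open a new connection
    return connect()
```

Usage would look like db = ensure_connection(db, lambda: MySQLdb.connect(DBIP, DBUSER, DBPASS, DBNAME)) at the top of each view.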
This might be due to DB connections getting copied into your child threads from the main thread. I faced the same error when using Python's multiprocessing library to spawn different processes. The connection objects are copied between processes during forking, and this leads to MySQL OperationalErrors when making DB calls in the child process.
Here's a good reference to solve this: Django multiprocessing and database connections
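For the forking case, one common guard is to remember which process opened the connection and to open a new one when the current PID differs. The sketch below is a generic illustration using only the standard library. PerProcessConnection is a hypothetical class, and the getpid parameter exists only to make the behavior testable.

```python
import os


class PerProcessConnection:
    """Lazily opens one connection per process and reopens it after a fork."""

    def __init__(self, connect, getpid=os.getpid):
        self._connect = connect    # zero-argument callable returning a connection
        self._getpid = getpid
        self._conn = None
        self._owner_pid = None

    def get(self):
        pid = self._getpid()
        if self._conn is None or self._owner_pid != pid:
            # The inherited object shares its socket with the parent process,
            # so it is simply replaced here rather than closed.
            self._conn = self._connect()
            self._owner_pid = pid
        return self._conn
```

In a Django project the analogous step is to close the inherited connections in the child before its first query, so that Django opens new ones on demand.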
For me this was happening in debug mode.
So I tried persistent connections in debug mode; check out the link: Django - Documentation - Databases - Persistent connections.
In settings:
'default': {
    'ENGINE': 'django.db.backends.mysql',
    'NAME': 'dbname',
    'USER': 'root',
    'PASSWORD': 'root',
    'HOST': 'localhost',
    'PORT': '3306',
    'CONN_MAX_AGE': None
},
Check if you are allowed to create mysql connection object in one thread and then use it in another.
If it's forbidden, use threading.local for per-thread connections:
class Db(threading.local):
    """ thread-local db object """
    con = None

    def __init__(self, ...options...):
        super(Db, self).__init__()
        self.con = MySQLdb.connect(...options...)

db = Db(...)

def test():
    """safe to run from any thread"""
    cursor = db.con.cursor()
    cursor.execute(...)
This error is mysterious because MySQL doesn't report why it disconnects, it just goes away.
It seems there are many causes of this kind of disconnection. One I just found: if the query string is too large, the server will disconnect. This probably relates to the max_allowed_packet setting.
I've been struggling with this issue too. I don't like the idea of increasing the timeout on the MySQL server. Auto-reconnect with CONN_MAX_AGE doesn't work either, as was mentioned. Unfortunately, I ended up wrapping every method that queries the database like this:
from django.db import connection
from django.db.utils import InterfaceError, OperationalError

def do_db(callback, *arg, **args):
    try:
        return callback(*arg, **args)
    except (OperationalError, InterfaceError) as e:
        # Connection has gone away. Filter by message or error code here
        # if other errors of these types could be caught as well.
        connection.close()
        return callback(*arg, **args)

do_db(User.objects.get, id=123)  # instead of User.objects.get(id=123)
As you can see, I prefer catching the exception to pinging the database before every query, because the exception is the rare case. I would expect Django to reconnect automatically, but they seem to have declined that request.
This error may occur when you try to use the connection after a time-consuming operation that doesn't go to the database. Since the connection is not used for some time, MySQL timeout is hit and the connection is silently dropped.
You can try calling close_old_connections() after the time-consuming non-DB operation so that a new connection is opened if the connection is unusable. Beware, do not use close_old_connections() if you have a transaction.
The most common issue regarding such warning, is the fact that your application has reached the wait_timeout value of MySQL.
I had the same problem with a Flask app.
Here's how I solved:
$ grep timeout /etc/mysql/mysql.conf.d/mysqld.cnf
# https://support.rackspace.com/how-to/how-to-change-the-mysql-timeout-on-a-server/
# wait = timeout for application session (tdm)
# interactive = timeout for keyboard session (terminal)
# 7 days = 604800s / 4 hours = 14400s
wait_timeout = 604800
interactive_timeout = 14400
Observation: if you query the variables via MySQL batch mode, the values appear as configured. But if you run SHOW VARIABLES LIKE 'wait%'; or SHOW VARIABLES LIKE 'interactive%'; in an interactive session, the value configured for interactive_timeout appears for both variables. I don't know why, but the values configured for each variable in '/etc/mysql/mysql.conf.d/mysqld.cnf' are respected by MySQL.
How old is this code? Django has had databases defined in settings since at least .96. Only other thing I can think of is multi-db support, which changed things a bit, but even that was 1.1 or 1.2.
Even if you need a special DB for certain views, I think you'd probably be better off defining it in settings.
I had this problem and did not have the option to change my configuration. I finally figured out that the problem was occurring 49500 records into my 50000-record loop, because that was about the time I was trying again (after having last tried a long time before) to hit my second database.
So I changed my code so that every few thousand records, I touched the second database again (with a count() of a very small table), and that fixed it. No doubt "ping" or some other means of touching the database would work as well.
First, check the session and global values of MySQL's wait_timeout and interactive_timeout variables. Second, make sure your client reconnects to the server at intervals shorter than those values.
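In a Django project, the closest built-in knob for the second point is CONN_MAX_AGE. The sketch below derives it from the server's wait_timeout. conn_max_age is a hypothetical helper, the 0.5 margin is an arbitrary choice, and 28800 seconds is MySQL's documented default wait_timeout. Note that Django only applies this limit at request boundaries, so long-running scripts still need their own handling.

```python
def conn_max_age(wait_timeout_seconds, safety_margin=0.5):
    """Lifetime for persistent connections, kept below the server's wait_timeout."""
    if wait_timeout_seconds <= 0:
        raise ValueError("wait_timeout_seconds must be positive")
    return max(1, int(wait_timeout_seconds * safety_margin))


# settings.py fragment; the credentials are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "dbname",
        "USER": "root",
        "PASSWORD": "root",
        "HOST": "localhost",
        "PORT": "3306",
        # Connections older than this are closed and reopened by Django.
        "CONN_MAX_AGE": conn_max_age(28800),
    }
}
```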

Twisted web service - sql connection drops

I am working on a web service with Twisted that is responsible for calling up several packages I had previously used on the command line. The routines these packages handle were being prototyped on their own but now are ready to be integrated into our webservice.
In short, I have several different modules that all create a mysql connection property internally in their original command line forms. Take this for example:
class searcher:
    def __init__(self,lat,lon,radius):
        self.conn = getConnection()[1]
        self.con=self.conn.cursor();
        self.mgo = getConnection(True)
        self.lat = lat
        self.lon = lon
        self.radius = radius
        self.profsinrange()
        self.cache = memcache.Client(["173.220.194.84:11211"])
The getConnection function is just a helper that returns a mongo or mysql cursor respectively. Again, this is all prototypical :)
The problem I am experiencing is when implemented as a consistently running server using Twisted's WSGI resource, the sql connection created in init times out, and subsequent requests don't seem to regenerate it. Example code for small server app:
from twisted.web import server
from twisted.web.wsgi import WSGIResource
from twisted.python.threadpool import ThreadPool
from twisted.internet import reactor
from twisted.application import service, strports
import cgi
import gnengine
import nn

wsgiThreadPool = ThreadPool()
wsgiThreadPool.start()
# ensuring that it will be stopped when the reactor shuts down
reactor.addSystemEventTrigger('after', 'shutdown', wsgiThreadPool.stop)

def application(environ, start_response):
    start_response('200 OK', [('Content-type','text/plain')])
    params = cgi.parse_qs(environ['QUERY_STRING'])
    try:
        lat = float(params['lat'][0])
        lon = float(params['lon'][0])
        radius = int(params['radius'][0])
        query_terms = params['query']
        s = gnengine.searcher(lat,lon,radius)
        query_terms = ' '.join( query_terms )
        json = s.query(query_terms)
        return [json]
    except Exception as e:
        return [str(e),str(params)]
    return ['error']

wsgiAppAsResource = WSGIResource(reactor, wsgiThreadPool, application)
# Hooks for twistd
application = service.Application('Twisted.web.wsgi Hello World Example')
server = strports.service('tcp:8080', server.Site(wsgiAppAsResource))
server.setServiceParent(application)
The first few requests work fine, but after mysqls wait_timeout expires, the dread error 2006 "Mysql has gone away" error surfaces. It had been my understanding that every request to the WSGI Twisted resource would run the application function, thereby regenerating the searcher object and re-leasing the connection. If this isn't the case, how can I make the requests processed as such? Is this kind of Twisted deployment not transactional in this sense? Thanks!
EDIT: Per request, here is the prototype helper function calling up the connection:
def getConnection(mong = False):
    if mong == False:
        connection = mysql.connect(host = db_host,
                                   user = db_user,
                                   passwd = db_pass,
                                   db = db,
                                   cursorclass=mysql.cursors.DictCursor)
        cur = connection.cursor();
        return (cur,connection)
    else:
        return pymongo.Connection('173.220.194.84',27017).gonation_test
I was developing a piece of software with Twisted where I had to maintain a constant MySQL database connection. I ran into this problem, and despite digging through the Twisted documentation extensively and posting a few questions, I was unable to find a proper solution. There is a boolean parameter you can pass when instantiating the adbapi.ConnectionPool class; however, it never seemed to work and I kept getting the error regardless. My guess is that the reconnect boolean controls the destruction of the connection object when a SQL disconnect occurs.
adbapi.ConnectionPool("MySQLdb", cp_reconnect=True, host="", user="", passwd="", db="")
I have not tested this, but I will re-post some results when I do; if anyone else has, please share.
When I was developing the script I was using Twisted 8.2.0 (I haven't touched Twisted in a while), and back then the framework had no explicit keep-alive method. So I developed a ping/keepalive extension that combines the event-driven paradigm Twisted builds upon with the MySQLdb module's own ping() method (see the code comment).
While typing this response I looked through the current Twisted documentation and was still unable to find an explicit keep-alive method or parameter. My guess is that this is because Twisted itself does not have database connectivity libraries or classes. It uses the ones available to Python and provides an indirect layer for interfacing with those modules, with some exposure for direct calls to the database library being used. This is accomplished with the adbapi runWithConnection method.
Here is the module I wrote under Twisted 8.2.0 and Python 2.6; you can set the intervals between pings. Every 20 minutes the script pings the database, and if the ping fails, it attempts to reconnect every 60 seconds. I must warn that the script does NOT handle a suddenly dropped connection; you can handle that through addErrback whenever you run a query through Twisted, at least that's how I did it. I have noticed that when the database connection drops, you only find out when you execute a query and the event raises an errback, and at that point you deal with it. Basically, if I don't run a query for 10 minutes and my database disconnects me, my application will not notice in real time. It will realize the connection has been dropped only when it runs the next query, so the database could have disconnected us 1 minute after the first query, or 5, or 9, etc.
I guess this goes back to the original idea I stated: Twisted uses Python's own libraries or third-party libraries for database connectivity, and because of that, some things are handled a bit differently.
from twisted.enterprise import adbapi
from twisted.internet import reactor, defer, task

class sqlClass:
    def __init__(self, db_pointer):
        self.dbpool=db_pointer
        self.dbping = task.LoopingCall(self.dbping)
        self.dbping.start(1200) #20 minutes = 1200 seconds; i found out that if MySQL socket is idled for 20 minutes or longer, MySQL itself disconnects the session for security reasons; i do believe you can change that in the configuration of the database server itself but it may not be recommended.
        self.reconnect=False
        print "database ping initiated"

    def dbping(self):
        def ping(conn):
            conn.ping() #what happens here is that twisted allows us to access methods from the MySQLdb module that python posesses; i chose to use the native command instead of sending null commands to the database.
        pingdb=self.dbpool.runWithConnection(ping)
        pingdb.addCallback(self.dbactive)
        pingdb.addErrback(self.dbout)
        print "pinging database"

    def dbactive(self, data):
        if data==None and self.reconnect==True:
            self.dbping.stop()
            self.reconnect=False
            self.dbping.start(1200) #20 minutes = 1200 seconds
            print "Reconnected to database!"
        elif data==None:
            print "database is active"

    def dbout(self, deferr):
        #print deferr
        if self.reconnect==False:
            self.dbreconnect()
        elif self.reconnect==True:
            print "Unable to reconnect to database"
        print "unable to ping MySQL database!"

    def dbreconnect(self, *data):
        self.dbping.stop()
        self.reconnect=True
        #self.dbping = task.LoopingCall(self.dbping)
        self.dbping.start(60) #60

if __name__ == "__main__":
    db = sqlClass(adbapi.ConnectionPool("MySQLdb", cp_reconnect=True, host="", user="", passwd="", db=""))
    reactor.callLater(2, db.dbping)
    reactor.run()
let me know how it works out for you :)
