SQLAlchemy session.refresh() spawns new connections - python

I am using SQLAlchemy as the ORM in my project. My problem is that every time I call session.refresh(obj), a new database connection is used, and these connections are held until session.close() is called.
So when I want to refresh multiple objects, I quickly run out of connections.
Session maker:
session = session_maker()
try:
    yield session
    session.commit()
    for obj in session:
        session.refresh(obj)
except Exception as e:
    session.rollback()
    raise e
finally:
    session.close()
Usage:
with make_session(...) as session:
    for mapped in [self._mapper.map(obj) for obj in objects]:
        saved_entities.append(mapped)
        session.add(mapped)
    session.flush()
I am using refresh because I have columns that are filled on update and I want to return the current values.
The curious thing is that when I do this:
for obj in session:
    session.commit()
    session.refresh(obj)
only two connections are used (which is fine), but the objects have no data.

Use scoped_session; see http://docs.sqlalchemy.org/en/latest/orm/contextual.html
If you do, you will get the same session (connection ID) each time you request it. Also, you don't need to call refresh(): add() and flush() should be enough. The updated values should be available after the flush() and before the commit(), but only if you look them up using the same session ID (database transaction), which is why you need a scoped_session.

Related

SQLAlchemy: Can't reconnect until invalid transaction is rolled back

I have a weird problem.
I have a simple py3 app, which uses sqlalchemy.
But several hours later, there is an error:
(sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back
My init part:
self.db_engine = create_engine(self.db_config, pool_pre_ping=True) # echo=True if needed to see background SQL
Session = sessionmaker(bind=self.db_engine)
self.db_session = Session()
The query (this is the only query that happens):
while True:
    device_id = self.db_session.query(Device).filter(Device.owned_by == msg['user_id']).first()
    sleep(20)
The whole script runs in an infinite loop, single-threaded (reading from SQS). Has anybody dealt with this problem?
The solution:
Don't keep your connection open for a long time.
The SQLAlchemy documentation recommends the same approach; see "Session Basics":
@contextmanager
def session_scope(self):
    self.db_engine = create_engine(self.db_config, pool_pre_ping=True) # echo=True if needed to see background SQL
    Session = sessionmaker(bind=self.db_engine)
    session = Session()
    try:
        # this is where the "work" happens!
        yield session
        # always commit changes!
        session.commit()
    except:
        # if any kind of exception occurs, rollback transaction
        session.rollback()
        raise
    finally:
        session.close()
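To see the commit/rollback/close behaviour of such a scope without a MySQL server, here is a minimal sketch that swaps in the standard-library sqlite3 module for SQLAlchemy. The names short_lived_connection, count_devices and the device table are hypothetical and exist only for illustration; the point is that each unit of work opens its own connection and releases it immediately afterwards.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager


@contextmanager
def short_lived_connection(path):
    """Open a connection for one unit of work, then always close it."""
    conn = sqlite3.connect(path)
    try:
        yield conn
        conn.commit()      # commit only if the block finished normally
    except Exception:
        conn.rollback()    # undo partial work on any error
        raise
    finally:
        conn.close()       # never keep the connection around between units


def count_devices(path):
    with short_lived_connection(path) as conn:
        return conn.execute("SELECT COUNT(*) FROM device").fetchone()[0]


# Hypothetical demo database in a temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

with short_lived_connection(db_path) as conn:
    conn.execute("CREATE TABLE device (id INTEGER PRIMARY KEY, owned_by TEXT)")
    conn.execute("INSERT INTO device (owned_by) VALUES (?)", ("alice",))

try:
    with short_lived_connection(db_path) as conn:
        conn.execute("INSERT INTO device (owned_by) VALUES (?)", ("bob",))
        raise RuntimeError("simulated failure")  # triggers the rollback path
except RuntimeError:
    pass
```

In a polling loop like the one in the question, the with block would sit inside the loop body, so nothing is held open during the sleep.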

ZODB commit stuck and make my whole application to freeze

Today I found a bug in my Python application that uses ZODB.
While trying to find out why my application freezes, I figured out that ZODB was the cause.
With logging set to debug, it seems that when committing, ZODB finds 2 connections and then freezes.
INFO:ZEO.ClientStorage:('127.0.0.1', 8092) Connected to storage: ('localhost', 8092)
DEBUG:txn.140661100980032:new transaction
DEBUG:txn.140661100980032:commit
DEBUG:ZODB.Connection:Committing savepoints of size 1858621925
DEBUG:discord.gateway:Keeping websocket alive with sequence 59.
DEBUG:txn.140661100980032:commit <Connection at 7fee2d080fd0>
DEBUG:txn.140661100980032:commit <Connection at 7fee359e5cc0>
As I'm a ZODB beginner, any idea how to solve this or how to dig deeper?
It seems to be related to concurrent commits.
I believed that opening a new connection would initiate a dedicated transaction manager, but this is not the case. When a new connection is opened without specifying a transaction manager, the local one (shared with other connections on the thread) is used.
My code:
async def get_connection():
    return ZEO.connection(8092)
async def _message_db_init_aux(self, channel, after=None, before=None):
    connexion = await get_connection()
    root = connexion.root()
    messages = await some_function_which_return_a_list()
    async for message in messages:
        # If author.id doesn't exist in the data, initialise it as a tree
        if message.author.id not in root.data: # root.data is a BTrees.OOBTree.BTree()
            root.data[message.author.id] = BTrees.OOBTree.BTree()
        # Message is a class that inherits from persistent.Persistent
        root.data[message.author.id][message.id] = Message(message.id, message.author.id, message.created_at)
    transaction.commit()
    connexion.close()
Don't re-use transaction managers across connections. Each connection has its own transaction manager; use that.
Your code currently creates the connection, then commits. Rather than creating the connection yourself, ask the database to create a transaction manager for you, which then manages its own connection. The transaction manager can be used as a context manager, meaning that changes to the database are automatically committed when the context ends.
Moreover, by using ZEO.connection() for each transaction, you are forcing ZEO to create a complete new client object, with a fresh cache and connection pool. By using ZEO.DB() instead, and caching the result, a single client is created from which connections can be pooled and reused, and with a local cache to speed up transactions.
I'd alter the code to:
def get_db():
    """Access the ZEO database client.
    The database client is cached to take advantage of caching and connection pooling
    """
    db = getattr(get_db, 'db', None)
    if db is None:
        get_db.db = db = ZEO.DB(8092)
    return db
async def _message_db_init_aux(self, channel, after=None, before=None):
    with self.get_db().transaction() as conn:
        root = conn.root()
        messages = await some_function_which_return_a_list()
        async for message in messages:
            # If author.id doesn't exist in the data, initialise it as a tree
            if message.author.id not in root.data: # root.data is a BTrees.OOBTree.BTree()
                root.data[message.author.id] = BTrees.OOBTree.BTree()
            # Message is a class that inherits from persistent.Persistent
            root.data[message.author.id][message.id] = Message(
                message.id, message.author.id, message.created_at
            )
The .transaction() method on the database object creates a new connection under the hood, the moment the context is entered (with causing __enter__ to be called), and when the with block ends the transaction is committed and the connection is released to the pool again.
Note that I used a synchronous def get_db() method; the call signatures on the ZEO client code are entirely synchronous. They are safe to call from asynchronous code because under the hood, the implementation uses asyncio throughout, using callbacks and tasks on the same loop, and actual I/O is deferred to separate tasks.
When no transaction manager is specified, the local one is used.
If you open multiple connections on the same thread, you have to specify the transaction manager you want to use. By default,
transaction.commit()
commits through the local (thread-level) transaction manager, whereas
connection.transaction_manager.commit()
uses the transaction manager dedicated to that connection (and not the local one).
For more information, see http://www.zodb.org/en/latest/guide/transactions-and-threading.html

Does this thread-local Flask-SQLAlchemy session cause a "MySQL server has gone away" error?

I have a web application that runs long jobs that are independent of user sessions. To achieve this, I have an implementation for a thread-local Flask-SQLAlchemy session. The problem is a few times a day, I get a MySQL server has gone away error when I visit my site. The site always loads upon refresh. I think the issue is related to these thread-local sessions, but I'm not sure.
This is my implementation of a thread-local session scope:
@contextmanager
def thread_local_session_scope():
    """Provides a transactional scope around a series of operations.
    Context is local to current thread.
    """
    # See this StackOverflow answer for details:
    # http://stackoverflow.com/a/18265238/1830334
    Session = scoped_session(session_factory)
    threaded_session = Session()
    try:
        yield threaded_session
        threaded_session.commit()
    except:
        threaded_session.rollback()
        raise
    finally:
        Session.remove()
And here is my standard Flask-SQLAlchemy session:
@contextmanager
def session_scope():
    """Provides a transactional scope around a series of operations.
    Context is HTTP request thread using Flask-SQLAlchemy.
    """
    try:
        yield db.session
        db.session.commit()
    except Exception as e:
        print 'Rolling back database'
        print e
        db.session.rollback()
    # Flask-SQLAlchemy handles closing the session after the HTTP request.
Then I use both session context managers like this:
def build_report(tag):
    report = _save_report(Report())
    thread = Thread(target=_build_report, args=(report.id,))
    thread.daemon = True
    thread.start()
    return report.id
# This executes in the main thread.
def _save_report(report):
    with session_scope() as session:
        session.add(report)
        session.commit()
        return report
# This executes in a separate thread.
def _build_report(report_id):
    with thread_local_session_scope() as session:
        report = do_some_stuff(report_id)
        session.merge(report)
EDIT: Engine configurations
app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://<username>:<password>@<server>:3306/<db>?charset=utf8'
app.config['SQLALCHEMY_POOL_RECYCLE'] = 3600
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
Try adding a function decorated with
@app.teardown_request
which executes at the end of each request. I am currently experiencing a similar issue, and it seems that today I have actually resolved it using:
@app.teardown_request
def teardown_request(exception=None):
    Session.remove()
    if exception and Session.is_active:
        print(exception)
        Session.rollback()
I do not use Flask-SQLAlchemy, only raw SQLAlchemy, so there may be differences for you.
From the Docs
The teardown callbacks are special callbacks in that they are executed
at a different point. Strictly speaking they are independent of the
actual request handling as they are bound to the lifecycle of the
RequestContext object. When the request context is popped, the
teardown_request() functions are called.
In my case, I open a new scoped_session for each request, which requires me to remove it at the end of each request (Flask-SQLAlchemy may not need this). Also, the teardown_request function is passed an exception if one occurred during the request context. In that scenario (where the exception may have left the transaction unremoved or in need of a rollback), we check whether there was an exception and roll back.
If this doesn't work in my own testing, the next thing I was going to try is a session.commit() at each teardown, just to make sure everything is flushed.
UPDATE: it also appears that MySQL invalidates connections after 8 hours, causing the Session to be corrupted.
Set pool_recycle=3600 in your engine configuration, or any value below the MySQL timeout. This, in conjunction with proper session scoping (closing sessions), should do it.
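To make the idea behind pool_recycle concrete, here is a minimal standard-library sketch of a pool that replaces any connection older than a threshold at checkout time. ToyPool and FakeConnection are hypothetical stand-ins written for this illustration, not SQLAlchemy classes, and the clock is injected so the example does not have to wait an hour.

```python
import time


class FakeConnection:
    """Hypothetical stand-in for a DBAPI connection."""

    def __init__(self, created_at):
        self.created_at = created_at


class ToyPool:
    """Toy pool that discards connections older than recycle_seconds."""

    def __init__(self, recycle_seconds, clock=time.monotonic):
        self.recycle_seconds = recycle_seconds
        self.clock = clock
        self.idle = []
        self.opened = 0

    def _connect(self):
        self.opened += 1
        return FakeConnection(self.clock())

    def checkout(self):
        while self.idle:
            conn = self.idle.pop()
            age = self.clock() - conn.created_at
            if age <= self.recycle_seconds:
                return conn          # young enough: reuse it
            # too old: drop it and look at the next idle connection
        return self._connect()

    def checkin(self, conn):
        self.idle.append(conn)


# Simulated clock so the demo runs instantly.
fake_now = [0.0]
pool = ToyPool(recycle_seconds=3600, clock=lambda: fake_now[0])

first = pool.checkout()
pool.checkin(first)

fake_now[0] = 1800.0
second = pool.checkout()   # 30 minutes old: reused
pool.checkin(second)

fake_now[0] = 4000.0
third = pool.checkout()    # older than 3600 s: replaced by a new connection
```

The threshold only helps if it is lower than the server's own idle timeout, which is why the advice above is to keep it below the MySQL setting.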

Why does SQLAlchemy/mysql keep timing out on me?

I have 2 functions that need to be executed and the first takes about 4 hours to execute. Both use SQLAlchemy:
def first():
    session = DBSession
    rows = session.query(Mytable).order_by(Mytable.col1.desc())[:150]
    for i,row in enumerate(rows):
        time.sleep(100)
        print i, row.accession
def second():
    print "going onto second function"
    session = DBSession
    new_row = session.query(Anothertable).order_by(Anothertable.col1.desc()).first()
    print 'New Row: ', new_row.accession
first()
second()
And here is how I define DBSession:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy import create_engine
engine = create_engine('mysql://blah:blah@blah/blahblah',echo=False,pool_recycle=3600*12)
DBSession = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=engine))
Base = declarative_base()
Base.metadata.bind = engine
first() finishes fine (takes about 4 hrs) and I see "going onto second function" printed then it immediately gives me an error:
sqlalchemy.exc.OperationalError: (OperationalError) (2006, 'MySQL server has gone away')
From reading the docs I thought assigning session=DBSession would get two different session instances and so that second() wouldn't timeout. I've also tried playing with pool_recycle and that doesn't seem to have any effect here. In the real world, I can't split first() and second() into 2 scripts: second() has to execute immediately after first()
Your engine (not the session) keeps a pool of connections. When a MySQL connection has not been used for several hours, the MySQL server closes the socket, which causes a "MySQL server has gone away" error when you try to use that connection. If you have a simple single-threaded script, then calling create_engine with pool_size=1 will probably do the trick. If not, you can use events to ping the connection when it is checked out of the pool. This great answer has all the details:
SQLAlchemy error MySQL server has gone away
"assigning session=DBSession would get two different session instances"
That simply isn't true. session = DBSession is a plain local variable assignment, and you cannot override local variable assignment in Python (you can override instance member assignment, but that's unrelated).
Another thing to note is that scoped_session produces, by default, a thread-local scoped session (i.e. all code running in the same thread shares the same session). Since you call first() and second() in the same thread, they use one and the same session.
One thing you can do is use a regular (unscoped) session: manage the session scope manually and create a new session in each function. Alternatively, check the docs on how to define a custom session scope.
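The two points above (plain name binding, and one session per thread) can be sketched with the standard library alone. ThreadLocalRegistry and FakeSession are hypothetical stand-ins that only mimic the thread-local behaviour described here; they are not SQLAlchemy's implementation, and unlike scoped_session the registry has to be called explicitly to obtain the session.

```python
import threading


class FakeSession:
    """Hypothetical stand-in for an ORM session."""


class ThreadLocalRegistry:
    """Hands out one session object per thread until remove() is called."""

    def __init__(self, factory):
        self._factory = factory
        self._local = threading.local()

    def __call__(self):
        if not hasattr(self._local, "session"):
            self._local.session = self._factory()
        return self._local.session

    def remove(self):
        if hasattr(self._local, "session"):
            del self._local.session


DBSession = ThreadLocalRegistry(FakeSession)


def first():
    session = DBSession   # binds a local name to the same registry object
    return session()


def second():
    session = DBSession   # again the same registry, not a new session
    return session()


# Same thread: both functions receive the identical session object.
same_thread = (first(), second())

# A different thread gets its own session.
other_thread = []
worker = threading.Thread(target=lambda: other_thread.append(first()))
worker.start()
worker.join()

# After remove(), the next request in this thread creates a fresh session.
DBSession.remove()
after_remove = first()
```

Under this model, getting a separate session in second() requires either removing the current one first or creating sessions explicitly in each function.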
It doesn't look like you're getting separate Session instances. If the first query is successfully committing, then your Session could be expiring after that commit.
Try setting auto-expire to false for your session:
DBSession = scoped_session(sessionmaker(expire_on_commit=False, autocommit=False, autoflush=False, bind=engine))
and then commit later.

SQLAlchemy error MySQL server has gone away

I get the error OperationalError: (OperationalError) (2006, 'MySQL server has gone away'). I already received this error when I coded a project on Flask, but I can't understand why I get it.
My code looks like this (if the code is small and executes fast, there are no errors):
db_engine = create_engine('mysql://root@127.0.0.1/mind?charset=utf8', pool_size=10, pool_recycle=7200)
Base.metadata.create_all(db_engine)
Session = sessionmaker(bind=db_engine, autoflush=True)
Session = scoped_session(Session)
session = Session()
# there many classes and functions
session.close()
This code returns the error 'MySQL server has gone away', but only after some time, when I use pauses in my script.
I use the MySQL that ships with openserver.ru (a web server package similar to WAMP).
Thanks.
Looking at the mysql docs, we can see that there are a bunch of reasons why this error can occur. However, the two main reasons I've seen are:
1) The most common reason is that the connection has been dropped because it hasn't been used in more than 8 hours (default setting)
By default, the server closes the connection after eight hours if nothing has happened. You can change the time limit by setting the wait_timeout variable when you start mysqld
I'll just mention for completeness the two ways to deal with that, but they've already been mentioned in other answers:
A: I have a very long running job and so my connection is stale. To fix this, I refresh my connection:
create_engine(conn_str, pool_recycle=3600) # recycle every hour
B: I have a long running service and long periods of inactivity. To fix this I ping mysql before every call:
create_engine(conn_str, pool_pre_ping=True)
2) My packet size is too large, which should throw this error:
_mysql_exceptions.OperationalError: (1153, "Got a packet bigger than 'max_allowed_packet' bytes")
I've only seen this buried in the middle of the trace, though often you'll only see the generic _mysql_exceptions.OperationalError (2006, 'MySQL server has gone away'), so it's hard to catch, especially if logs are in multiple places.
The above doc says the max packet size is 64MB by default, but it's actually 16MB, which can be verified with SELECT @@max_allowed_packet
To fix this, decrease the packet size for INSERT or UPDATE calls.
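As an illustration of option B above (pinging before every use), the following standard-library sketch tests each idle connection at checkout and silently replaces it if the ping fails. PrePingPool and FakeConnection are hypothetical stand-ins for this example, not SQLAlchemy classes, and ConnectionError stands in for the driver's "server has gone away" error.

```python
class FakeConnection:
    """Hypothetical stand-in for a DBAPI connection."""

    def __init__(self):
        self.alive = True

    def ping(self):
        # A real driver would send something like "SELECT 1" here.
        if not self.alive:
            raise ConnectionError("server has gone away")


class PrePingPool:
    """Toy pool that pings idle connections before handing them out."""

    def __init__(self):
        self.idle = []
        self.opened = 0

    def _connect(self):
        self.opened += 1
        return FakeConnection()

    def checkout(self):
        while self.idle:
            conn = self.idle.pop()
            try:
                conn.ping()
            except ConnectionError:
                continue         # stale connection: discard it, try the next
            return conn
        return self._connect()   # nothing usable left: open a new one

    def checkin(self, conn):
        self.idle.append(conn)


pool = PrePingPool()
conn = pool.checkout()
pool.checkin(conn)

conn.alive = False       # simulate the server closing the idle connection
fresh = pool.checkout()  # the ping fails, so a new connection is opened
```

The cost of this approach is one extra round trip per checkout, which is the trade-off against simply recycling connections by age.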
SQLAlchemy now has a great write-up on how you can use pinging to be pessimistic about your connection's freshness:
http://docs.sqlalchemy.org/en/latest/core/pooling.html#disconnect-handling-pessimistic
From there,
from sqlalchemy import exc
from sqlalchemy import event
from sqlalchemy.pool import Pool
@event.listens_for(Pool, "checkout")
def ping_connection(dbapi_connection, connection_record, connection_proxy):
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SELECT 1")
    except:
        # optional - dispose the whole pool
        # instead of invalidating one at a time
        # connection_proxy._pool.dispose()
        # raise DisconnectionError - pool will try
        # connecting again up to three times before raising.
        raise exc.DisconnectionError()
    cursor.close()
And a test to make sure the above works:
from sqlalchemy import create_engine
e = create_engine("mysql://scott:tiger@localhost/test", echo_pool=True)
c1 = e.connect()
c2 = e.connect()
c3 = e.connect()
c1.close()
c2.close()
c3.close()
# pool size is now three.
print "Restart the server"
raw_input()
for i in xrange(10):
    c = e.connect()
    print c.execute("select 1").fetchall()
    c.close()
According to the documentation, you can use the pool_recycle parameter:
from sqlalchemy import create_engine
e = create_engine("mysql://scott:tiger@localhost/test", pool_recycle=3600)
I just faced the same problem and solved it with some effort. I hope my experience is helpful to others.
Following some suggestions, I used a connection pool and set pool_recycle to less than wait_timeout, but it still didn't work.
Then I realized that a global session may simply keep using the same connection, so the connection pool had no effect. To avoid a global session, generate a new session for each request and remove it with Session.remove() after processing.
Finally, all is well.
One more point to keep in mind is to manually push the flask application context with database initialization. This should resolve the issue.
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy()
app = Flask(__name__)
with app.app_context():
    db.init_app(app)
https://docs.sqlalchemy.org/en/latest/core/pooling.html#disconnect-handling-optimistic
def sql_read(cls, sql, connection):
    """sql for read action like select
    """
    LOG.debug(sql)
    try:
        result = connection.engine.execute(sql)
        header = result.keys()
        for row in result:
            yield dict(zip(header, row))
    except OperationalError as e:
        LOG.info("recreate pool due to %s" % e)
        connection.engine.pool.recreate()
        result = connection.engine.execute(sql)
        header = result.keys()
        for row in result:
            yield dict(zip(header, row))
    except Exception as ee:
        LOG.error(ee)
        raise SqlExecuteError()
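The retry-once pattern in the function above can be isolated into a small standard-library sketch. run_with_one_retry, flaky_query and the reset callback are hypothetical names for this illustration, and ConnectionError stands in for the driver's OperationalError; the sketch assumes the operation returns a complete result rather than yielding rows.

```python
def run_with_one_retry(operation, reset):
    """Run operation; on a connection error, reset and try exactly once more."""
    try:
        return operation()
    except ConnectionError:
        reset()              # e.g. throw away the stale pool
        return operation()   # a second failure propagates to the caller


attempts = []
resets = []


def flaky_query():
    """Hypothetical query that fails on the first call only."""
    attempts.append(1)
    if len(attempts) == 1:
        raise ConnectionError("server has gone away")
    return [{"id": 1}]


rows = run_with_one_retry(flaky_query, lambda: resets.append("pool recreated"))
```

Returning the full result before handing it to the caller is a deliberate choice here: in a generator, a disconnect that happens after some rows were already yielded would cause the retry to yield those rows a second time.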
