ZODB commit stuck and makes my whole application freeze - Python

Today I found a bug in my Python application using ZODB.
Trying to find out why my application freezes up, I figured out that ZODB was the cause.
Setting the logging to debug, it seems that when committing, ZODB finds 2 connections and then starts freezing.
INFO:ZEO.ClientStorage:('127.0.0.1', 8092) Connected to storage: ('localhost', 8092)
DEBUG:txn.140661100980032:new transaction
DEBUG:txn.140661100980032:commit
DEBUG:ZODB.Connection:Committing savepoints of size 1858621925
DEBUG:discord.gateway:Keeping websocket alive with sequence 59.
DEBUG:txn.140661100980032:commit <Connection at 7fee2d080fd0>
DEBUG:txn.140661100980032:commit <Connection at 7fee359e5cc0>
As I'm a ZODB beginner, any idea on how to solve this or how to dig deeper?
It seems to be related to concurrent commits.
I believed that opening a new connection would create a dedicated transaction manager, but this is not the case. When a new connection is opened without specifying a transaction manager, the local one (shared with the other connections on the thread) is used.
My code:
async def get_connection():
    return ZEO.connection(8092)

async def _message_db_init_aux(self, channel, after=None, before=None):
    connexion = await get_connection()
    root = connexion.root()
    messages = await some_function_which_return_a_list()
    async for message in messages:
        # If author.id doesn't exist in the data yet, initialise it as a Tree
        if message.author.id not in root.data:  # root.data is a BTrees.OOBTree.BTree()
            root.data[message.author.id] = BTrees.OOBTree.BTree()
        # Message is a class inheriting from persistent.Persistent
        root.data[message.author.id][message.id] = Message(message.id, message.author.id, message.created_at)
    transaction.commit()
    connexion.close()

Don't re-use transaction managers across connections. Each connection has its own transaction manager, use that.
Your code currently creates the connection, then commits. Rather than create the connection, ask the database to create a transaction manager for you, which then manages its own connection. The transaction manager can be used as a context manager, meaning that changes to the database are automatically committed when the context ends.
Moreover, by using ZEO.connection() for each transaction, you are forcing ZEO to create a completely new client object, with a fresh cache and connection pool. By using ZEO.DB() instead, and caching the result, a single client is created from which connections can be pooled and reused, and with a local cache to speed up transactions.
I'd alter the code to:
def get_db():
    """Access the ZEO database client.

    The database client is cached to take advantage of caching and connection pooling.
    """
    db = getattr(get_db, 'db', None)
    if db is None:
        get_db.db = db = ZEO.DB(8092)
    return db

async def _message_db_init_aux(self, channel, after=None, before=None):
    with get_db().transaction() as conn:
        root = conn.root()
        messages = await some_function_which_return_a_list()
        async for message in messages:
            # If author.id doesn't exist in the data yet, initialise it as a Tree
            if message.author.id not in root.data:  # root.data is a BTrees.OOBTree.BTree()
                root.data[message.author.id] = BTrees.OOBTree.BTree()
            # Message is a class inheriting from persistent.Persistent
            root.data[message.author.id][message.id] = Message(
                message.id, message.author.id, message.created_at
            )
The .transaction() method on the database object creates a new connection under the hood the moment the context is entered (the with statement causes __enter__ to be called), and when the with block ends the transaction is committed and the connection is released to the pool again.
Note that I used a synchronous def get_db() method; the call signatures on the ZEO client code are entirely synchronous. They are safe to call from asynchronous code because under the hood, the implementation uses asyncio throughout, using callbacks and tasks on the same loop, and actual I/O is deferred to separate tasks.

When not specified, the local transaction manager is used.
If you open multiple connections on the same thread, you have to specify which transaction manager you want to use. By default,
transaction.commit()
uses the local transaction manager, while
connection.transaction_manager.commit()
will use the transaction manager dedicated to that connection (and not the local one), as sketched below.
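A minimal sketch of that approach (assuming the same ZEO setup as in the question; the root modification is illustrative):
import transaction
import ZEO

db = ZEO.DB(8092)

# Give this connection its own transaction manager instead of the
# thread-local default that is shared with other connections on the thread.
tm = transaction.TransactionManager()
conn = db.open(transaction_manager=tm)
try:
    root = conn.root()
    # ... modify root ...
    tm.commit()   # commits only the work done through this connection
except Exception:
    tm.abort()
    raise
finally:
    conn.close()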
For more information, check http://www.zodb.org/en/latest/guide/transactions-and-threading.html

Related

Closing django ORM connections in a multi-threaded environment

I have the below code in a standalone script which uses the Django ORM (outside Django) with multithreading.
import threading

MAX_THREADS = 30
semaphore = threading.Semaphore(value=MAX_THREADS)
threads = []

def process_book(book_id):
    semaphore.acquire()
    book = Books.objects.get(id=book_id)
    # Do some time taking stuff here
    book.save()
    semaphore.release()

books = Books.objects.all()
for book in books:
    book_id = book.id
    t = threading.Thread(target=process_book, args=[book_id])
    t.start()
    threads.append(t)

for t in threads:
    t.join()
Once the number of threads reaches the max_connections setting of Postgres (which is 100 by default), I start getting the following error on further calls:
operationalError at FATAL: remaining connection slots are reserved for
non-replication superuser connections
Researching this, I got to solutions of using a database pooler like PgBouncer. However, that only stalls the new connections until connections are available, but again after PgBouncer's query_wait_timeout I hit
OperationalError at / query_wait_timeout server closed the
connection unexpectedly This probably means the server terminated
abnormally before or while processing the request.
I understand that this is happening because the threads are not closing the DB connections they are making, but I am not sure how to even close the ORM connections. Is there something I could be doing differently in the above code flow to reduce the number of connections?
I need to do a get on individual instances in order to update them, because .update() or .save() won't work on queryset items.
This updates the field in all the books in the database at once (field_to_update and new_value are placeholders; bulk_update takes the list of modified objects):
for book in books:
    book.field_to_update = new_value
Books.objects.bulk_update(books, ['field_to_update'])
To update each single book:
def process_book(book_id):
    semaphore.acquire()
    # Do some time taking stuff here, then update only the changed field
    Books.objects.filter(id=book_id).update(field_to_update=new_value)
    semaphore.release()
Just close the database connections at the end of your threads
from django import db

def process_book(book_id):
    semaphore.acquire()
    book = Books.objects.get(id=book_id)
    # Do some time taking stuff here
    book.save()
    semaphore.release()
    db.connections.close_all()
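Another way to keep the connection count bounded (a sketch, not from the answers above): let a thread pool cap how many threads, and therefore how many per-thread connections, exist at once, and still close each thread's connections when a task finishes.
from concurrent.futures import ThreadPoolExecutor

from django import db

MAX_WORKERS = 30  # stays well below Postgres' max_connections

def process_book_with_cleanup(book_id):
    try:
        process_book(book_id)
    finally:
        # Django keeps one connection per thread; close it so an idle
        # pooled worker thread doesn't keep holding it.
        db.connections.close_all()

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
    executor.map(process_book_with_cleanup, [book.id for book in books])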

Acquiring pool connections in Python Gino (async)

I'm using Postgres and Python 3.7 with asyncio + asyncpg + gino (ORM-ish) + aiohttp (routing, web responses).
I created a small postgres table users in my database testdb and inserted a single row:
testdb=# select * from users;
id | nickname
----+----------
1 | fantix
I'm trying to set up my database such that I can make use of the ORM within routes as requests come in.
import os
import time
import asyncio

import gino

DATABASE_URL = os.environ.get('DATABASE_URL')

db = gino.Gino()

class User(db.Model):
    __tablename__ = 'users'

    id = db.Column(db.Integer(), primary_key=True)
    nickname = db.Column(db.Unicode(), default='noname')

kwargs = dict(
    min_size=10,
    max_size=100,
    max_queries=1000,
    max_inactive_connection_lifetime=60 * 5,
    echo=True
)

async def test_engine_implicit():
    await db.set_bind(DATABASE_URL, **kwargs)
    return await User.query.gino.all()  # this works

async def test_engine_explicit():
    engine = await gino.create_engine(DATABASE_URL, **kwargs)
    db.bind = engine
    async with engine.acquire() as conn:
        return await conn.all(User.select())  # this doesn't work!

users = asyncio.get_event_loop().run_until_complete(test_engine_implicit())
print(f'implicit query: {users}')
users = asyncio.get_event_loop().run_until_complete(test_engine_explicit())
print(f'explicit query: {users}')
The output is:
web_1 | INFO gino.engine._SAEngine SELECT users.id, users.nickname FROM users
web_1 | INFO gino.engine._SAEngine ()
web_1 | implicit query: [<db.User object at 0x7fc57be42410>]
web_1 | INFO gino.engine._SAEngine SELECT
web_1 | INFO gino.engine._SAEngine ()
web_1 | explicit query: [()]
which is strange. The "explicit" code essentially runs a bare SELECT against the database, which is useless.
I can't find in the documentation a way to both 1) use the ORM, and 2) explicitly check out connections from the pool.
Questions I have:
Does await User.query.gino.all() check out a connection from the pool? How is it released?
How would I wrap queries in a transaction? I am uneasy that I am not able to explicitly control when / where I acquire a connection from the pool, and how I release it.
I'd essentially like the explicitness of the style in test_engine_explicit() to work with Gino, but perhaps I'm just not understanding how the Gino ORM works.
I have never used GINO before, but after a quick look into the code:
A GINO connection simply executes the provided clause as is. Thus, if you provide a bare User.select(), then it adds nothing to it.
If you want to achieve the same as User.query.gino.all(), but maintain the connection yourself, then you could follow the docs and use User.query instead of a plain User.select():
async with engine.acquire() as conn:
    return await conn.all(User.query)
Just tested and it works fine for me.
Regarding the connection pool, I am not sure that I got the question correctly, but Engine.acquire creates a reusable connection by default, which is then added to the pool, which is actually a stack:
:param reusable: Mark this connection as reusable or otherwise. This
has no effect if it is a reusing connection. All reusable connections
are placed in a stack, any reusing acquire operation will always
reuse the top (latest) reusable connection. One reusable connection
may be reused by several reusing connections - they all share one
same underlying connection. Acquiring a connection with
``reusable=False`` and ``reusing=False`` makes it a cleanly isolated
connection which is only referenced once here.
There is also a manual transaction control in GINO, so e.g. you can create a non-reusable, non-reuse connection and control transaction flow manually:
async with engine.acquire(reuse=False, reusable=False) as conn:
    tx = await conn.transaction()
    try:
        await conn.status("INSERT INTO users(nickname) VALUES('e')")
        await tx.commit()
    except Exception:
        await tx.rollback()
        raise
As for connection release, I cannot find any evidence that GINO releases connections itself. I guess that pool is maintained by SQLAlchemy core.
I definitely have not answered your questions directly, but hope it will help you somehow.
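Based on GINO's transaction documentation, the transaction can also be used as an async context manager, which commits on a clean exit and rolls back if the block raises; a sketch of the example above in that style:
async with engine.acquire(reuse=False, reusable=False) as conn:
    async with conn.transaction():
        # rolled back automatically if this raises, committed otherwise
        await conn.status("INSERT INTO users(nickname) VALUES('e')")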

Preserve a state in Python spyne (like a DB connection)

I am using Python v3.5 with the package spyne 2.13, running on a gunicorn server v19.9.
I wrote a small SOAP web service with Python spyne (working well). It takes a string and enqueues it to RabbitMQ. It doesn't necessarily have to be RabbitMQ; it could also be a simple DB insert or something like that. Right now it works fine, but each time the web service is called, it:
opens a RabbitMQ connection (or a DB connection if you'd like)
sends the message
closes the connection again(?)
I'd like to somehow preserve the connection in some sort of 'instance variable' and re-use it every time the web service gets called, so that it connects only once and not every time I call the web service. Unfortunately, spyne does not seem to create any objects, so there are no instance variables.
Generally: how can I preserve a state (a DB or RabbitMQ connection) when using spyne?
So I tried this trick with static class properties, like so:
class Ws2RabbitMQ(ServiceBase):
    rabbit_connection = pika.BlockingConnection(
        pika.ConnectionParameters(host='localhost'))
    rabbit_channel = rabbit_connection.channel()

    @staticmethod
    def connectRabbit():
        rabbit_cred = pika.PlainCredentials(username='...', password='...')
        Ws2RabbitMQ.rabbit_connection = pika.BlockingConnection(pika.ConnectionParameters(
            host='...', virtual_host='...', credentials=rabbit_cred))
        Ws2RabbitMQ.rabbit_channel = Ws2RabbitMQ.rabbit_connection.channel()
        print('Rabbit connected!')

    @rpc(AnyXml, _returns=Unicode)
    def exportGRID(ctx, payload):
        try:
            if not Ws2RabbitMQ.rabbit_connection.is_open:
                print('RabbitMQ Connection lost - reconnecting...')
                Ws2RabbitMQ.connectRabbit()
        except Exception as e:
            print('RabbitMQ Connection not found - initiating...')
            Ws2RabbitMQ.connectRabbit()
        Ws2RabbitMQ.rabbit_channel.basic_publish(
            exchange='ws2rabbitmq', routing_key="blind", body=payload)
        print(" [x] Sent")
        return 'OK'
When I call the web service twice, it works. Now, the connection is created only once and kept in the singleton property.
Here is the script's output:
RabbitMQ Connection not found - initiating...
Rabbit connected!
[x] Sent
[x] Sent
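A variation on the same idea, sketched outside of spyne: keep the connection in module-level globals and create it lazily on first use, so that nothing connects at import time (the class-attribute version above already connects to localhost when the class body is executed). Host and credentials are placeholders, as in the code above.
import pika

_rabbit_connection = None
_rabbit_channel = None

def get_rabbit_channel():
    """Return a shared channel, (re)connecting only when needed."""
    global _rabbit_connection, _rabbit_channel
    if _rabbit_connection is None or not _rabbit_connection.is_open:
        credentials = pika.PlainCredentials(username='...', password='...')
        _rabbit_connection = pika.BlockingConnection(pika.ConnectionParameters(
            host='...', virtual_host='...', credentials=credentials))
        _rabbit_channel = _rabbit_connection.channel()
    return _rabbit_channel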

Does this thread-local Flask-SQLAlchemy session cause a "MySQL server has gone away" error?

I have a web application that runs long jobs that are independent of user sessions. To achieve this, I have an implementation of a thread-local Flask-SQLAlchemy session. The problem is that a few times a day, I get a "MySQL server has gone away" error when I visit my site. The site always loads upon refresh. I think the issue is related to these thread-local sessions, but I'm not sure.
This is my implementation of a thread-local session scope:
@contextmanager
def thread_local_session_scope():
    """Provides a transactional scope around a series of operations.
    Context is local to the current thread.
    """
    # See this StackOverflow answer for details:
    # http://stackoverflow.com/a/18265238/1830334
    Session = scoped_session(session_factory)
    threaded_session = Session()
    try:
        yield threaded_session
        threaded_session.commit()
    except:
        threaded_session.rollback()
        raise
    finally:
        Session.remove()
And here is my standard Flask-SQLAlchemy session:
@contextmanager
def session_scope():
    """Provides a transactional scope around a series of operations.
    Context is the HTTP request thread using Flask-SQLAlchemy.
    """
    try:
        yield db.session
        db.session.commit()
    except Exception as e:
        print 'Rolling back database'
        print e
        db.session.rollback()
    # Flask-SQLAlchemy handles closing the session after the HTTP request.
Then I use both session context managers like this:
def build_report(tag):
    report = _save_report(Report())
    thread = Thread(target=_build_report, args=(report.id,))
    thread.daemon = True
    thread.start()
    return report.id

# This executes in the main thread.
def _save_report(report):
    with session_scope() as session:
        session.add(report)
        session.commit()
        return report

# This executes in a separate thread.
def _build_report(report_id):
    with thread_local_session_scope() as session:
        report = do_some_stuff(report_id)
        session.merge(report)
EDIT: Engine configurations
app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://<username>:<password>@<server>:3306/<db>?charset=utf8'
app.config['SQLALCHEMY_POOL_RECYCLE'] = 3600
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
Try adding an
app.teardown_request(exception=None)
decorator, which executes at the end of each request. I am currently experiencing a similar issue, and it seems as if today I have actually resolved it using:
@app.teardown_request
def teardown_request(exception=None):
    Session.remove()
    if exception and Session.is_active:
        print(exception)
        Session.rollback()
I do not use Flask-SQLAlchemy, only raw SQLAlchemy, so there may be differences for you.
From the docs:
The teardown callbacks are special callbacks in that they are executed
at a different point. Strictly speaking they are independent of the
actual request handling as they are bound to the lifecycle of the
RequestContext object. When the request context is popped, the
teardown_request() functions are called.
In my case, I open a new scoped_session for each request, requiring me to remove it at the end of each request (Flask-SQLAlchemy may not need this). Also, the teardown_request function is passed an exception if one occurred during the context. In this scenario, if an exception occurred (possibly causing the transaction to not be removed, or to need a rollback), we check whether there was an exception and roll back.
If this doesn't work in my own testing, the next thing I was going to do is a session.commit() at each teardown, just to make sure everything is flushed.
UPDATE: it also appears MySQL invalidates connections after 8 hours, causing the session to be corrupted.
Set pool_recycle=3600 on your engine configuration, or to a setting less than the MySQL timeout. This, in conjunction with proper session scoping (closing sessions), should do it.
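A sketch of what that looks like on the engine; the URL is illustrative, and on SQLAlchemy 1.2+ pool_pre_ping additionally tests each pooled connection before handing it out, which covers the "gone away" case even more directly:
from sqlalchemy import create_engine

engine = create_engine(
    'mysql://user:password@server:3306/db?charset=utf8',
    pool_recycle=3600,    # recycle well under MySQL's wait_timeout
    pool_pre_ping=True,   # requires SQLAlchemy >= 1.2
)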

Django persistent database connection

I'm using Django with Apache, mod_wsgi and PostgreSQL (all on the same host), and I need to handle a lot of simple dynamic page requests (hundreds per second). I ran into the problem that the bottleneck is that Django doesn't have a persistent database connection and reconnects on each request (which takes about 5ms).
While doing a benchmark, I found that with a persistent connection I can handle about 500 r/s, while without one I get only 50 r/s.
Does anyone have any advice? How can I modify Django to use a persistent connection, or speed up the connection from Python to the DB?
Django 1.6 added persistent connections support (link to the docs for the latest stable Django):
Persistent connections avoid the overhead of re-establishing a
connection to the database in each request. They’re controlled by the
CONN_MAX_AGE parameter which defines the maximum lifetime of a
connection. It can be set independently for each database.
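A settings.py sketch (database name and user are illustrative); CONN_MAX_AGE is in seconds, 0 closes the connection at the end of each request, and None keeps it open indefinitely:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': '...',
        'HOST': 'localhost',
        'CONN_MAX_AGE': 600,  # keep connections open for up to 10 minutes
    }
}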
Try PgBouncer - a lightweight connection pooler for PostgreSQL.
Features:
Several levels of brutality when rotating connections:
Session pooling
Transaction pooling
Statement pooling
Low memory requirements (2k per connection by default).
In Django trunk, edit django/db/__init__.py and comment out the line:
signals.request_finished.connect(close_connection)
This signal handler causes it to disconnect from the database after every request. I don't know what all of the side-effects of doing this will be, but it doesn't make any sense to start a new connection after every request; it destroys performance, as you've noticed.
I'm using this now, but I haven't done a full set of tests to see if anything breaks.
I don't know why everyone thinks this needs a new backend or a special connection pooler or other complex solutions. This seems very simple, though I don't doubt there are some obscure gotchas that made them do this in the first place, which should be dealt with more sensibly; 5ms overhead for every request is quite a lot for a high-performance service, as you've noticed. (It takes me 150ms; I haven't figured out why yet.)
Edit: another necessary change is in django/middleware/transaction.py; remove the two transaction.is_dirty() tests and always call commit() or rollback(). Otherwise, it won't commit a transaction if it only read from the database, which will leave locks open that should be closed.
I created a small Django patch that implements connection pooling for MySQL and PostgreSQL via SQLAlchemy pooling.
This has worked perfectly in production on http://grandcapital.net/ for a long period of time.
The patch was written after googling the topic a bit.
Disclaimer: I have not tried this.
I believe you need to implement a custom database back end. There are a few examples on the web that shows how to implement a database back end with connection pooling.
Using a connection pool would probably be a good solution for your case, as the network connections are kept open when connections are returned to the pool.
This post accomplishes this by patching Django (one of the comments points out that it is better to implement a custom back end outside of the core django code)
This post is an implementation of a custom db back end
Both posts use MySQL - perhaps you are able to use similar techniques with Postgresql.
Edit:
The Django Book mentions Postgresql connection pooling, using pgpool (tutorial).
Someone posted a patch for the psycopg2 backend that implements connection pooling. I suggest creating a copy of the existing back end in your own project and patching that one.
This is a package for Django connection pooling:
django-db-connection-pool
pip install django-db-connection-pool
You can provide additional options to pass to SQLAlchemy's pool creation; the key's name is POOL_OPTIONS:
DATABASES = {
    'default': {
        ...
        'POOL_OPTIONS': {
            'POOL_SIZE': 10,
            'MAX_OVERFLOW': 10
        }
        ...
    }
}
I made a small custom psycopg2 backend that implements a persistent connection using a global variable.
With this I was able to improve the amount of requests per second from 350 to 1600 (on a very simple page with a few selects).
Just save it in a file called base.py in any directory (e.g. postgresql_psycopg2_persistent) and set in settings
DATABASE_ENGINE to projectname.postgresql_psycopg2_persistent
NOTE!!! The code is not threadsafe - you can't use it with Python threads because of unpredictable results; in the case of mod_wsgi, please use prefork daemon mode with threads=1.
# Custom DB backend postgresql_psycopg2 based
# implements persistent database connection using global variable
from django.db.backends.postgresql_psycopg2.base import DatabaseError, \
    DatabaseWrapper as BaseDatabaseWrapper, IntegrityError
from psycopg2 import OperationalError

connection = None

class DatabaseWrapper(BaseDatabaseWrapper):
    def _cursor(self, *args, **kwargs):
        global connection
        if connection is not None and self.connection is None:
            try:  # Check if connection is alive
                connection.cursor().execute('SELECT 1')
            except OperationalError:  # The connection is not working, need reconnect
                connection = None
            else:
                self.connection = connection
        cursor = super(DatabaseWrapper, self)._cursor(*args, **kwargs)
        if connection is None and self.connection is not None:
            connection = self.connection
        return cursor

    def close(self):
        if self.connection is not None:
            self.connection.commit()
            self.connection = None
Or here is a thread-safe one, but Python threads don't use multiple cores, so you won't get the same performance boost as with the previous one. You can use it with a multi-process setup too.
# Custom DB backend postgresql_psycopg2 based
# implements persistent database connection using thread local storage
from threading import local

from django.db.backends.postgresql_psycopg2.base import DatabaseError, \
    DatabaseWrapper as BaseDatabaseWrapper, IntegrityError
from psycopg2 import OperationalError

threadlocal = local()

class DatabaseWrapper(BaseDatabaseWrapper):
    def _cursor(self, *args, **kwargs):
        if hasattr(threadlocal, 'connection') and threadlocal.connection is \
                not None and self.connection is None:
            try:  # Check if connection is alive
                threadlocal.connection.cursor().execute('SELECT 1')
            except OperationalError:  # The connection is not working, need reconnect
                threadlocal.connection = None
            else:
                self.connection = threadlocal.connection
        cursor = super(DatabaseWrapper, self)._cursor(*args, **kwargs)
        if (not hasattr(threadlocal, 'connection') or threadlocal.connection \
                is None) and self.connection is not None:
            threadlocal.connection = self.connection
        return cursor

    def close(self):
        if self.connection is not None:
            self.connection.commit()
            self.connection = None
