The following test works fine:
import pytest
from sqlalchemy import create_engine, text
from sqlalchemy.engine.url import URL
from sqlalchemy_utils import database_exists, create_database
@pytest.fixture()
def db_engine():
    engine = create_engine(
        URL('postgres',
            username='postgres', password='postgres', database='my_database')
    )
    if not database_exists(engine.url):
        create_database(engine.url)
    return engine
def test_new_table_is_empty(db_engine):
    try:
        db_engine.execute(text('CREATE SCHEMA test_schema;'))
        db_engine.execute(text('CREATE TABLE test_schema.new_table ();'))
        result = db_engine.execute(text('SELECT * FROM test_schema.new_table;'))
        assert result.rowcount == 0
    finally:
        try:
            del result  # The result would block the dropping of
                        # SCHEMA "test_schema" in the following cleanup.
        except NameError:
            pass
        db_engine.execute(text('DROP SCHEMA IF EXISTS test_schema CASCADE;'))
But if I make it fail by changing assert result.rowcount == 0 to assert result.rowcount == 1, it hangs indefinitely on the last line (where the schema should be dropped) and cannot even be aborted with Ctrl+C. I have to kill the py.test process (or the python process, depending on how I invoked the test runner) to terminate it. (If I append
if __name__ == "__main__":
    test_new_table_is_empty(db_engine())
and run the file as a normal python script instead of through py.test, I get the expected AssertionError.)
However, if I just replace the assertion by assert False (and run with py.test again), the test suite finishes with one failed test, as expected. Thus, I assume that pytest retains a reference to result iff the assertion failed, probably for the error analysis it displays with the stack trace. Is that the case?
How can and should I avoid the blocking? Should I only ever make test assertions on data fetched from the result rather than on properties of the ResultProxy itself?
TL;DR
Call
result.close()
instead of
del result
Answers to your specific questions:
I assume that pytest retains a reference to result iff the assertion failed, probably for the error analysis it displays with the stack trace. Is that the case?
That is still my working assumption. If someone knows better, please enlighten me.
How can and should I avoid the blocking?
Instead of deleting the ResultProxy result, explicitly close() it in your finally clause:
def test_new_table_is_empty(db_engine):
    try:
        db_engine.execute(text('CREATE SCHEMA test_schema;'))
        db_engine.execute(text('CREATE TABLE test_schema.new_table ();'))
        result = db_engine.execute(text('SELECT * FROM test_schema.new_table;'))
        assert result.rowcount == 0
    finally:
        try:
            result.close()  # Release row and table locks.
        except NameError:
            pass
        db_engine.execute(text('DROP SCHEMA IF EXISTS test_schema CASCADE;'))
That will release all row and table locks held by result.
To avoid the clutter of nested try clauses, you can use contextlib.closing(...):
from contextlib import closing

# ...

def test_new_table_is_empty(db_engine):
    try:
        db_engine.execute(text('CREATE SCHEMA test_schema;'))
        db_engine.execute(text('CREATE TABLE test_schema.new_table ();'))
        with closing(
            db_engine.execute(text('SELECT * FROM test_schema.new_table;'))
        ) as result:
            assert result.rowcount == 0
    finally:
        db_engine.execute(text('DROP SCHEMA IF EXISTS test_schema CASCADE;'))
Should I only ever make test assertions on data fetched from the result rather than on properties of the ResultProxy itself?
That would only work if you fetch all rows, thereby exhausting the ResultProxy, which would implicitly _soft_close() it. If the result had (maybe unexpectedly) more rows than you'd fetch, the result would stay open and continue to potentially hold locks that'd keep the following cleanup from being executed.
As you're only interested in the rowcount in your test, not in the actual result content, explicitly closing the result is a better choice than fetching rows you won't use for anything but counting them.
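For completeness, the fetch-everything variant of the test would look roughly like this (a sketch; it only stays unblocked because fetchall() exhausts the result, which soft-closes it):

def test_new_table_is_empty(db_engine):
    try:
        db_engine.execute(text('CREATE SCHEMA test_schema;'))
        db_engine.execute(text('CREATE TABLE test_schema.new_table ();'))
        # fetchall() exhausts the ResultProxy, which implicitly soft-closes it
        # and releases its locks -- provided every row really is fetched.
        rows = db_engine.execute(
            text('SELECT * FROM test_schema.new_table;')
        ).fetchall()
        assert len(rows) == 0
    finally:
        db_engine.execute(text('DROP SCHEMA IF EXISTS test_schema CASCADE;'))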
Related
The problem
For a while now I've encountered a bug where a data-retrieval query hangs during execution. If that were all, debugging would be fine, but it is not easy to reproduce:
It only occurs on my Linux laptop (Manjaro XFCE), with no problems on my Windows PC
It primarily occurs at a few specific timestamps (mostly 4:05)
Even then it doesn't consistently appear
I know how this can be fixed (by prepending the query with a SELECT 1;), but I don't understand why the problem occurs or why my workaround fixes it, which is where I'm stuck. I haven't seen any other questions that describe this specific issue.
Code
The query in question is below. It selects a range of measurements and then averages them per timestep (interpolating where necessary) to get a series of averages.
SELECT datetime, AVG(wc) as wc
FROM (
SELECT public.time_bucket_gapfill('5 minutes', m.datetime)
AS datetime, public.interpolate(AVG(m.wc)) as wc
FROM growficient.measurement AS m
INNER JOIN growficient.placement AS p ON m.placement_id = p.id
WHERE m.datetime >= '2022-09-30T22:00:00+00:00'
AND m.datetime < '2022-10-01T04:05:00+00:00'
AND p.section_id = 'bd5114b8-4aab-11eb-af66-32bd66d4e25c'
GROUP BY public.time_bucket_gapfill('5 minutes', m.datetime), p.id
) AS placement_averages
GROUP BY datetime
ORDER BY datetime;
This is then executed via SQLAlchemy at the session level. When the bug appears, execution never reaches the fetchall().
execute_result = session.execute(query)
readings = execute_result.fetchall()
We're using session management very similar to what's shown in the SQLAlchemy documentation. This is meant to be a debug session, however, so no commit statements are included.
sessionMaker = sessionmaker(
    autocommit=False,
    autoflush=False,
    bind=create_engine(
        config.get_settings().main_db,
        echo=False,
        connect_args=connect_options,
        pool_pre_ping=True,
    ),
)
@contextlib.contextmanager
def managed_session() -> Session:
    session = sessionMaker()
    try:
        yield session
    except Exception as e:
        session.rollback()
        logger.error("Session error: %s", e)
        raise
    finally:
        session.close()
Observations
I can see the transaction hanging if I execute select * from pg_catalog.pg_stat_activity psa
Printing the identical query and then executing it directly inside the database (e.g. in DBeaver) correctly returns the results
None of the timeouts mentioned in the Postgres documentation do anything to break out of the hang
Adding a SELECT 1; statement works (sketched below), but setting pool_pre_ping=True in the engine doesn't, which confuses me, as to my understanding they do the same thing.
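For reference, a minimal sketch of that workaround (exactly how the SELECT 1 is prepended here is my assumption, not the exact production code):

from sqlalchemy import text

with managed_session() as session:
    # Workaround: run a trivial statement on the same connection first,
    # then execute the real query.
    session.execute(text("SELECT 1;"))
    readings = session.execute(query).fetchall()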
I have a big problem with a deadlock in an InnoDB table used with sqlalchemy.
sqlalchemy.exc.InternalError: (mysql.connector.errors.InternalError) 1213 (40001): Deadlock found when trying to get lock; try restarting transaction.
I have already serialized the access, but still get a deadlock error.
This code is executed on the first call in every function. Every thread and process should wait here until it gets the lock. It's simplified; the selectors are removed.
# The work item with offset -1 always exists.
f = s.query(WorkerInProgress).with_for_update().filter(
    WorkerInProgress.offset == -1).first()
I have reduced my code to a minimal state. I am currently running only concurrent calls on the method next_slice. Session handling, rollback and deadlock handling are done outside.
I get deadlocks even though all access is serialized. I also tried incrementing a retry counter in the offset == -1 entity.
def next_slice(self, s, processgroup_id, itemcount):
    f = s.query(WorkerInProgress).with_for_update().filter(
        WorkerInProgress.offset == -1).first()

    # Take the first matching object if available / maybe some workers failed
    item = s.query(WorkerInProgress).with_for_update().filter(
        WorkerInProgress.processgroup_id != processgroup_id,
        WorkerInProgress.processgroup_id != 'finished',
        WorkerInProgress.processgroup_id != 'finished!locked',
        WorkerInProgress.offset != -1
    ).order_by(WorkerInProgress.offset.asc()).limit(1).first()

    # *****
    # Some code is missing here, as it's not executed in my test case.

    # Fetch the latest item and add a new one
    item = s.query(WorkerInProgress).with_for_update().order_by(
        WorkerInProgress.offset.desc()).limit(1).first()

    new = WorkerInProgress()
    new.offset = item.offset + item.count
    new.count = itemcount
    new.maxtries = 3
    new.processgroup_id = processgroup_id

    s.add(new)
    s.commit()

    return new.offset, new.count
I don't understand why the deadlocks are occurring.
I have reduced the deadlocks by fetching all items in one query, but I still get them. Perhaps someone can help me.
I finally solved my problem. It's all in the documentation, but I had to understand it first.
Always be prepared to re-issue a transaction if it fails due to
deadlock. Deadlocks are not dangerous. Just try again.
Source: http://dev.mysql.com/doc/refman/5.7/en/innodb-deadlocks-handling.html
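In practice that means wrapping the transaction in a retry loop, roughly like this (a sketch, not my actual code; it assumes the deadlock surfaces as a DBAPIError whose original message contains the "Deadlock found" text quoted above, and the usage names are illustrative):

from sqlalchemy import exc

def retry_on_deadlock(s, fn, max_tries=3):
    # Re-issue the whole transaction if it fails with a deadlock (error 1213).
    for attempt in range(1, max_tries + 1):
        try:
            return fn(s)
        except exc.DBAPIError as e:
            s.rollback()
            # Re-raise anything that isn't a deadlock, or if we're out of tries.
            if attempt == max_tries or 'Deadlock found' not in str(e.orig):
                raise

# e.g.: offset, count = retry_on_deadlock(s, lambda s: worker.next_slice(s, pg_id, 100))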
I solved my problem by changing the architecture of this part. I still get a lot of deadlocks, but they now appear almost exclusively in the short-running methods.
I split my worker table into a locking and a non-locking part. The actions on the locking part are now very short, and no data is handled during the get_slice, finish_slice and fail_slice operations.
The transaction part with the data handling is now in the non-locking part, without concurrent access to table rows. The results are written back to the locking table in finish_slice and fail_slice.
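Roughly, the locking part now boils down to something like this (a sketch of the idea only; WorkerSlice and its columns are placeholder names, not my real schema):

def get_slice(self, s, processgroup_id, itemcount):
    # Locking part: claim the next slice and commit immediately, so the
    # row lock is held only for a very short time.
    last = s.query(WorkerSlice).with_for_update().order_by(
        WorkerSlice.offset.desc()).first()
    new = WorkerSlice(offset=last.offset + last.count,
                      count=itemcount,
                      processgroup_id=processgroup_id)
    s.add(new)
    s.commit()  # lock released here, before any data processing starts
    return new.offset, new.count

def finish_slice(self, s, offset):
    # Locking part again: just record the outcome, no data handling here.
    row = s.query(WorkerSlice).with_for_update().filter(
        WorkerSlice.offset == offset).one()
    row.processgroup_id = 'finished'
    s.commit()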
I finally found a good description on Stack Overflow too, after identifying the right search terms:
https://stackoverflow.com/a/2596101/5532934
We've been experimenting with sqlalchemy's disconnect handling, and how it integrates with ORM. We've studied the docs, and the advice seems to be to catch the disconnect exception, issue a rollback() and retry the code.
e.g.:
import sqlalchemy as SA

retry = 2
while retry:
    retry -= 1
    try:
        for name in session.query(Names):
            print(name)
        break
    except SA.exc.DBAPIError as exc:
        if retry and exc.connection_invalidated:
            session.rollback()
        else:
            raise
I follow the rationale -- you have to rollback any active transactions and replay them to ensure a consistent ordering of your actions.
BUT -- this means a lot of extra code added to every function that wants to work with data. Furthermore, in the case of SELECT, we're not modifying data and the concept of rollback/re-request is not only unsightly, but a violation of the principle of DRY (don't repeat yourself).
I was wondering if others would mind sharing how they handle disconnects with sqlalchemy.
FYI: we're using sqlalchemy 0.9.8 and Postgres 9.2.9
The way I like to approach this is to place all my database code in a lambda or closure and pass it into a helper function that handles catching the disconnect exception and retrying.
So with your example:
import sqlalchemy as SA

def main():
    def query():
        for name in session.query(Names):
            print(name)

    run_query(query)

def run_query(f, attempts=2):
    while attempts > 0:
        attempts -= 1
        try:
            return f()  # "break" if the query was successful and return any results
        except SA.exc.DBAPIError as exc:
            if attempts > 0 and exc.connection_invalidated:
                session.rollback()
            else:
                raise
You can make this more fancy by passing a boolean into run_query for the case where you are only doing a read and therefore want to retry without rolling back, as sketched below.
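For example (a sketch; whether skipping the rollback is actually safe for your reads is an assumption you'd want to verify against your transaction state):

def run_query(f, attempts=2, is_read_only=False):
    while attempts > 0:
        attempts -= 1
        try:
            return f()
        except SA.exc.DBAPIError as exc:
            if attempts > 0 and exc.connection_invalidated:
                if not is_read_only:
                    session.rollback()  # only replay writes inside a fresh transaction
            else:
                raise

# Read-only caller: run_query(query, is_read_only=True)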
This helps you satisfy the DRY principle since all the ugly boiler-plate code for managing retries + rollbacks is placed in one location.
Using exponential backoff (https://github.com/litl/backoff):
@backoff.on_exception(
    backoff.expo,
    sqlalchemy.exc.DBAPIError,
    factor=7,
    max_tries=3,
    on_backoff=lambda details: LocalSession.get_main_sql_session().rollback(),
    on_giveup=lambda details: LocalSession.get_main_sql_session().flush(),  # flush the session
    logger=logging,
)
def pessimistic_insertion(document_metadata):
    LocalSession.get_main_sql_session().add(document_metadata)
    LocalSession.get_main_sql_session().commit()
Assuming that LocalSession.get_main_sql_session() returns a singleton.
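For context, something along these lines is assumed for that accessor (a hypothetical sketch; MAIN_DB_URL and the lazy-singleton shape are placeholders, since the real LocalSession isn't shown):

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

class LocalSession:
    _session = None

    @classmethod
    def get_main_sql_session(cls):
        # Create the session once and hand back the same object on every call.
        if cls._session is None:
            # MAIN_DB_URL: placeholder connection string
            cls._session = sessionmaker(bind=create_engine(MAIN_DB_URL))()
        return cls._session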
I have functions in which I am doing database operations. I want to do something special: before trying to fetch data from the database, I want to check whether the cursor object is null and whether the connection has been dropped due to a timeout. How can I do this pre-checking in Python?
My functions, foo and bar:
class A:
    connection = None
    cursor = None

    def connect(self):
        # Keep the connection and a cursor on the instance.
        self.connection = MySQLdb.connect( ... )
        self.cursor = self.connection.cursor()

    def foo(self):
        self.cursor.execute("SELECT * FROM car_models")

    def bar(self):
        self.cursor.execute("SELECT * FROM incomes")
The problem with checking before an operation is that the bad condition you're checking for could happen between the check and the operation.
For example, suppose you had a method is_timed_out() to check if the connection had timed out:
if not self.cursor.is_timed_out():
self.cursor.execute("SELECT * FROM incomes")
On the face of it, this looks like you've avoided the possibility of a CursorTimedOut exception from the execute call. But it's possible for you to call is_timed_out, get a False back, then the cursor times out, and then you call the execute function, and get an exception.
Yes, the chance is very small that it will happen at just the right moment. But in a server environment, a one-in-a-million chance will happen a few times a day. Bad stuff.
You have to be prepared for your operations to fail with exceptions. And once you've got exception handling in place for those problems, you don't need the pre-checks any more, because they are redundant.
You can check whether the cursor is null easily:
if cursor is None:
    ... do something ...
Otherwise the usual thing in Python is to "ask for forgiveness not permission": use your database connection and if it has timed out catch the exception and handle it (otherwise you might just find that it times out between your test and the point where you use it).
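Concretely, foo from the question could become something like this (a sketch; the reconnect-once-and-retry policy is my assumption, and it relies on the connection being stored under a name that doesn't shadow the connect() method, as in the class above):

import MySQLdb

def foo(self):
    try:
        self.cursor.execute("SELECT * FROM car_models")
    except (MySQLdb.OperationalError, MySQLdb.InterfaceError):
        # The connection was dropped (e.g. it timed out): reconnect and retry once.
        self.connect()
        self.cursor.execute("SELECT * FROM car_models")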
I have an issue with SQLAlchemy apparently committing. A rough sketch of my code:
trans = self.conn.begin()
try:
    assert not self.conn.execute(my_obj.__table__.select(my_obj.id == id)).first()
    self.conn.execute(my_obj.__table__.insert().values(id=id))
    assert not self.conn.execute(my_obj.__table__.select(my_obj.id == id)).first()
except:
    trans.rollback()
    raise
I don't commit, and the second assert always fails! In other words, it seems the data is getting inserted into the database even though the code is within a transaction! Is this assessment accurate?
You're right that the changes aren't being committed to the DB. But they are auto-flushed by SQLAlchemy when you perform a query; in your case the flush happens on the lines with the asserts. So if you don't explicitly call commit, you will never see these changes in the DB as real, committed data. However, you will keep getting them back as long as you use the same conn object.
You can pass autoflush=False to the session constructor to disable this behavior.
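A minimal sketch of that option, assuming an ORM sessionmaker (engine and MyObj are placeholders, not names from your code):

from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine, autoflush=False)  # engine: your existing Engine
session = Session()

session.add(MyObj(id=42))          # MyObj: placeholder mapped class
rows = session.query(MyObj).all()  # with autoflush=False this query no longer
                                   # flushes the pending insert behind your back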