query non-deterministically hanging on self._clslevel[target] = collections.deque() - python

I have a program running the following query via the SQLAlchemy ORM. At this particular point in the code, it is running serially.
# MS SQL database setup
engine = create_engine(conn_string, pool_size=pool_size, pool_recycle=3600, echo=False)
engine.execute('set transaction isolation level read uncommitted')
session = scoped_session(sessionmaker(bind=engine, autocommit=autocommit))
# hangs in here
max_id = session.query(func.max(entity.entity_id)).all()[0][0]
Occasionally, the program hangs as if it were deadlocked, and execution does not proceed for hours. I have ruled out any issues with the database connection or blocked queries, as a periodic stack trace of my main calling thread shows execution hanging at the following spot (see the stack trace below).
It seems like a dictionary is being updated. Even if SQLAlchemy were doing something fancy with threads under the hood, the weakref.py frame in the trace is WeakKeyDictionary.__setitem__ (https://docs.python.org/3/library/weakref.html#weakref.WeakKeyDictionary), in which self.data is a native Python dictionary, which should be safe under concurrent access.
What could be causing this intermittent hanging? I am on SQLAlchemy 1.0.12 and Python 3.5.2.
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2588, in all
return list(self)
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2732, in __iter__
context = self._compile_context()
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 3180, in _compile_context
if self.dispatch.before_compile:
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/event/base.py", line 288, in __get__
obj.__dict__['dispatch'] = disp = self.dispatch_cls._for_instance(obj)
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/event/base.py", line 110, in _for_instance
return self._for_class(instance_cls)
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/event/base.py", line 106, in _for_class
return self.__class__(self, instance_cls)
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/event/base.py", line 84, in __init__
for ls in parent._event_descriptors
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/event/base.py", line 84, in <genexpr>
for ls in parent._event_descriptors
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/event/attr.py", line 188, in __init__
parent.update_subclass(target_cls)
File "/home/me/py35/lib/python3.5/site-packages/sqlalchemy/event/attr.py", line 125, in update_subclass
self._clslevel[target] = collections.deque()
File "/home/me/py35/lib/python3.5/weakref.py", line 378, in __setitem__
self.data[ref(key, self._remove)] = value
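For reference, the weakref.py frame above is WeakKeyDictionary.__setitem__. Here is a minimal, purely illustrative sketch (not from my app) of what that structure does:
import weakref

class Target:
    pass

# WeakKeyDictionary stores weak references to its keys, mirroring the
# self.data[ref(key, self._remove)] = value line in the trace above
d = weakref.WeakKeyDictionary()
key = Target()
d[key] = "listener collection"
print(len(d))  # 1 while key is alive
del key        # the ref callback removes the entry once key is collected
print(len(d))  # 0 after garbage collection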

Related

Querying on mysql docker container via python, throwing timeout error after few hours

I am inserting via a Debezium connector into a MySQL database brought up in a Docker container.
Querying works fine for some hours, but after that, the same query throws the exception below.
export JAVA_HOME=/tmp/tests/artifacts/java-17/jdk-17; export PATH=$PATH:/tmp/tests/artifacts/java-17/jdk-17/bin; docker exec -i mysql_be1e6a mysql --user=demo --password=demo -D demo -e "select count(k) from test_cdc_f0bf84 where uuid = 'd1e5cd6d-8f7a-457c-b2ea-880c2be52f69'"
2023-01-02 16:27:43,812:ERROR: failed to execute query MySQL rows count by uuid:
Traceback (most recent call last):
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/channel.py", line 699, in recv
out = self.in_buffer.read(nbytes, self.timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/buffered_pipe.py", line 164, in read
raise PipeTimeout()
paramiko.buffered_pipe.PipeTimeout
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/suites/cdc/abstract.py", line 667, in try_query
res = query_function()
^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/suites/cdc/test_cdc.py", line 635, in <lambda>
query = lambda: self.mysql_query(
^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/suites/cdc/abstract.py", line 544, in mysql_query
result = self.ssh.exec_on_host(host, [
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/main/connection.py", line 335, in exec_on_host
return self._exec_on_host(host, commands, fetch, timeout=timeout, limit_output=limit_output)[host]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/main/connection.py", line 321, in _exec_on_host
res = list(out)
^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/file.py", line 125, in __next__
line = self.readline()
^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/file.py", line 291, in readline
new_data = self._read(n)
^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/channel.py", line 1361, in _read
return self.channel.recv(size)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/channel.py", line 701, in recv
raise socket.timeout()
TimeoutError
After some time, I logged in to the machine manually and ran the same query; it still reads fine, so I am not sure what this issue means.
As explained, I tried querying the database via Python. I expected it to return the row count, which it did until a certain point, but after that it threw a timeout error and a socket error.
The default value for interactive_timeout and wait_timeout is 28800 seconds (8 hours). You can disable this behavior by setting these system variables to zero in your MySQL config.
source: Configuring session timeouts
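For a quick check from Python, here is a hedged sketch (assuming the PyMySQL driver and the demo credentials from the question; adjust to your setup) that inspects and raises the session timeout:
import pymysql

conn = pymysql.connect(host="localhost", user="demo", password="demo", database="demo")
with conn.cursor() as cur:
    cur.execute("SHOW VARIABLES LIKE 'wait_timeout'")
    print(cur.fetchone())  # e.g. ('wait_timeout', '28800')
    cur.execute("SET SESSION wait_timeout = 86400")  # raise the idle timeout for this session only
conn.close()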

Abort connection when database is read-only (Flask/SQLAlchemy)

I am facing the following issue:
We have configured failover DB nodes for our staging environment. When testing, sometimes the failover happens and Flask keeps open connections to some nodes which are now read-only -- any write operation then fails:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/lib/python3.7/site-packages/elasticapm/instrumentation/packages/dbapi2.py", line 210, in execute
return self.trace_sql(self.wrapped_.execute, sql, params)
File "/usr/local/lib/python3.7/site-packages/elasticapm/instrumentation/packages/dbapi2.py", line 244, in _trace_sql
result = method(sql, params)
psycopg2.errors.ReadOnlySqlTransaction: cannot execute DELETE in a read-only transaction
I'd like to detect this somehow and close the connection to these nodes, so that any write operation succeeds. Is this possible?
You can import that error class into your module and then use it in a try/except block:
from psycopg2.errors import ReadOnlySqlTransaction

try:
    ...  # your main stuff here
except ReadOnlySqlTransaction:
    ...  # terminate the connection
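Beyond wrapping individual operations, you can react at the engine level. A hedged sketch, assuming SQLAlchemy's handle_error engine event (available since 0.9.7) and psycopg2 >= 2.8; the engine URL is a placeholder:
from psycopg2.errors import ReadOnlySqlTransaction
from sqlalchemy import create_engine, event

engine = create_engine("postgresql:///example")  # placeholder URL

@event.listens_for(engine, "handle_error")
def drop_read_only_connections(context):
    # Treat "node became read-only" like a disconnect: SQLAlchemy then
    # invalidates the failed connection, and the pool opens a fresh one
    # on the next checkout, ideally reaching the new primary.
    if isinstance(context.original_exception, ReadOnlySqlTransaction):
        context.is_disconnect = True
With this in place, a stale read-only connection is discarded instead of being reused for further writes.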

UPDATE statement on table 'xxx' expected to update 1 row(s); 2 were matched [duplicate]

I am running a Pyramid + Zope transaction manager + SQLAlchemy + PostgreSQL stack. On some occasions, I have seen a StaleDataError on a Pyramid web application in what should be a very trivial view that updates one row in the database. As the error happens outside the normal view boundary and is not repeatable, it is quite tricky to debug.
I guess this might have something to do with broken database connections or the transaction lifecycle. However, I don't know how to start debugging the system, so I am asking what could cause this and, furthermore, how one can pin down errors like this.
UPDATE statement on table 'xxx' expected to update 1 row(s); 0 were matched.
Stacktrace (most recent call last):
File "pyramid/tweens.py", line 20, in excview_tween
response = handler(request)
File "pyramid_tm/__init__.py", line 94, in tm_tween
reraise(*exc_info)
File "pyramid_tm/compat.py", line 15, in reraise
raise value
File "pyramid_tm/__init__.py", line 82, in tm_tween
manager.commit()
File "transaction/_manager.py", line 111, in commit
return self.get().commit()
File "transaction/_transaction.py", line 280, in commit
reraise(t, v, tb)
File "transaction/_compat.py", line 55, in reraise
raise value
File "transaction/_transaction.py", line 271, in commit
self._commitResources()
File "transaction/_transaction.py", line 417, in _commitResources
reraise(t, v, tb)
File "transaction/_compat.py", line 55, in reraise
raise value
File "transaction/_transaction.py", line 389, in _commitResources
rm.tpc_begin(self)
File "/srv/pyramid/trees/venv/lib/python3.4/site-packages/zope/sqlalchemy/datamanager.py", line 90, in tpc_begin
self.session.flush()
File "sqlalchemy/orm/session.py", line 2004, in flush
self._flush(objects)
File "sqlalchemy/orm/session.py", line 2122, in _flush
transaction.rollback(_capture_exception=True)
File "sqlalchemy/util/langhelpers.py", line 60, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
File "sqlalchemy/util/compat.py", line 182, in reraise
raise value
File "sqlalchemy/orm/session.py", line 2086, in _flush
flush_context.execute()
File "sqlalchemy/orm/unitofwork.py", line 373, in execute
rec.execute(self)
File "sqlalchemy/orm/unitofwork.py", line 532, in execute
uow
File "sqlalchemy/orm/persistence.py", line 170, in save_obj
mapper, table, update)
File "sqlalchemy/orm/persistence.py", line 692, in _emit_update_statements
(table.description, len(records), rows))
This is the most likely scenario:
You have two requests that first select an object and then try to update/delete it in the datastore, and you end up with a race condition.
Let's say you want to fetch an object and then update it.
If the transaction takes some time and you do not select the object with FOR UPDATE (thus locking the rows), the object can get deleted by the first request while the second transaction tries to issue an UPDATE against a row that is no longer present in the database, and you end up with this exception.
You can try row locking to prevent this from happening: the subsequent transaction will wait for the first operation to finish before it gets executed.
http://docs.sqlalchemy.org/en/rel_1_0/orm/query.html?highlight=for_update#sqlalchemy.orm.query.Query.with_for_update
and
http://docs.sqlalchemy.org/en/rel_1_0/orm/query.html?highlight=with_lockmode#sqlalchemy.orm.query.Query.with_lockmode
These describe some of the SQLAlchemy machinery you can use to resolve this; a sketch follows.
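A hedged sketch of the with_for_update() approach (MyModel, some_id, and new_value are illustrative names; session is your SQLAlchemy session):
obj = (
    session.query(MyModel)
    .filter(MyModel.id == some_id)
    .with_for_update()  # emits SELECT ... FOR UPDATE, locking the row
    .one()
)
obj.value = new_value
session.commit()  # the commit releases the row lock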
Another option:
TL;DR: if you have first() somewhere, you may need to remove it in SQLAlchemy when you update multiple records.
db.session.query(xxx).filter_by(field=value).first()
This expects the update to affect only one row, and it does if your table has only one record with field=value. This is particularly the case if the field is your ID.
HOWEVER, if your ID is not unique, you might have multiple records with the same ID.
In this case, you can update them all by removing first(), as sketched below.
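A hedged sketch of that bulk form (Model, field, value, other_field, and new_value are illustrative names):
db.session.query(Model).filter_by(field=value).update(
    {"other_field": new_value},
    synchronize_session=False,  # skip syncing in-memory objects for a bulk UPDATE
)
db.session.commit()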
BTW, use the following to debug your SQL queries (which wouldn't have helped this time...)
import logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

django.db.utils.InterfaceError: connection already closed failures when updating to Django 3.0

I am updating a medium-sized project to Django 3.0 and I am encountering several errors in my tests after doing nothing more than bumping the Django version from 2.2.
The whole test suite has been running correctly for years, and I couldn't find any relevant change in the changelog that might point to the cause of this issue. Apparently a single test failure triggers every remaining test in the same TestCase class to fail with the following exception:
Traceback (most recent call last):
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/backends/base/base.py", line 238, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/utils/asyncio.py", line 24, in inner
return func(*args, **kwargs)
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/backends/postgresql/base.py", line 231, in create_cursor
cursor = self.connection.cursor()
psycopg2.InterfaceError: connection already closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/federicobond/code/forks/core/apps/participants/tests/test_views.py", line 40, in setUp
self.client.force_login(self.user)
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/test/client.py", line 602, in force_login
self._login(user, backend)
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/test/client.py", line 611, in _login
if self.session:
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/test/client.py", line 461, in session
session.save()
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/contrib/sessions/backends/db.py", line 81, in save
return self.create()
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/contrib/sessions/backends/db.py", line 51, in create
self._session_key = self._get_new_session_key()
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/contrib/sessions/backends/base.py", line 162, in _get_new_session_key
if not self.exists(session_key):
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/contrib/sessions/backends/db.py", line 47, in exists
return self.model.objects.filter(session_key=session_key).exists()
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/models/query.py", line 777, in exists
return self.query.has_results(using=self.db)
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/models/sql/query.py", line 534, in has_results
return compiler.has_results()
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1107, in has_results
return bool(self.execute_sql(SINGLE))
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1135, in execute_sql
cursor = self.connection.cursor()
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/utils/asyncio.py", line 24, in inner
return func(*args, **kwargs)
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/backends/base/base.py", line 260, in cursor
return self._cursor()
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/backends/base/base.py", line 238, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/backends/base/base.py", line 238, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/utils/asyncio.py", line 24, in inner
return func(*args, **kwargs)
File "/Users/federicobond/code/forks/core/env/lib/python3.7/site-packages/django/db/backends/postgresql/base.py", line 231, in create_cursor
cursor = self.connection.cursor()
django.db.utils.InterfaceError: connection already closed
I am out of ideas as to what could be going on here.
I ran into this as well. It appears to be a bug in pytest-django. Here's the relevant issue. There's an open PR to resolve it. If it's a big enough inconvenience you can use the branch in that PR or pin your dependencies to an earlier version.
We were hitting the same issue, and upgrading from Django 3.0.2 to Django 3.0.4 resolved it. There are several DB-related fixes between those two versions, but I don't know which one solved our problem.
Just a forward note: it's nearly impossible to provide more information than is already in the stack trace. However, you can investigate:
Check when the connection gets closed and by which test (run them individually via a script, for example; see the sketch below).
For tests that fail, check the code for deprecated parts of Django (search for things that were removed or deprecated between 2.2 and 3.0).
Run a linter to see if someone changed a private variable inside the Django framework as a workaround.
Check the transactions of Postgres.
Then, once you know which part of the code triggers the error, narrow it down by creating smaller failing tests.
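For the first step, a minimal sketch (assuming a standard manage.py layout; the glob pattern is illustrative):
import glob
import subprocess

# run each test module on its own to see which one leaves the connection closed
for path in sorted(glob.glob("apps/**/test_*.py", recursive=True)):
    label = path[:-3].replace("/", ".")  # e.g. apps/participants/tests/test_views.py -> dotted test label
    print("=>", label)
    subprocess.run(["python", "manage.py", "test", label])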
I had the same issue using pytest.
Downgrading from 5.4.1 to 5.3.5 fixed it.
All these issues occur because of the incompatibility of other packages with Django 3.0.
When I ran into this error, I updated my requirements.txt file manually and then installed all the requirements using pip in the same environment.

Cassandra Celery python timeout happens on raw query execution using django db connection execute

My Celery worker is configured for the Cassandra session like this:
from celery.signals import worker_process_init
from cassandra.cqlengine import connection
from cassandra.cqlengine.connection import cluster as cql_cluster, session as cql_session
from django.conf import settings

def cassandra_init(*args, **kwargs):
    """ Initialize a clean Cassandra connection. """
    if cql_cluster is not None:
        cql_cluster.shutdown()
    if cql_session is not None:
        cql_session.shutdown()
    connection.setup([settings.DATABASES["default"]["HOST"]], settings.DATABASES["default"]["NAME"])

# Initialize worker context (only standard tasks)
worker_process_init.connect(cassandra_init)
When I execute a raw Cassandra query, a timeout happens:
from django.db import connection
cursor = connection.cursor()
total_ap = cursor.execute(
"SELECT cpu_info FROM ap_live_stats;")
It works fine everywhere else in my Django project, but not inside the Celery tasks.
Error:
[2018-05-09 18:50:21,576: ERROR/ForkPoolWorker-5] Task apps.statistic.tasks.ap_hourly_data_migrator[77a596d4-61a2-43f4-8580-6abc6e9b5866] raised unexpected: OperationTimedOut("errors={'192.168.98.65': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=192.168.98.65",)
Traceback (most recent call last):
File "/home/vkchlt0079/virtuals/wlc-env/lib/python3.5/site-packages/celery/app/trace.py", line 374, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/vkchlt0079/virtuals/wlc-env/lib/python3.5/site-packages/celery/app/trace.py", line 629, in __protected_call__
return self.run(*args, **kwargs)
File "/home/vkchlt0079/projects/wlcd/src/web_gui/backend/django/wlcd/apps/statistic/tasks.py", line 59, in ap_hourly_data_migrator
"SELECT cpu_info FROM ap_live_stats;")
File "/home/vkchlt0079/virtuals/wlc-env/lib/python3.5/site-packages/django_cassandra_engine/utils.py", line 47, in execute
return self.cursor.execute(sql)
File "/home/vkchlt0079/virtuals/wlc-env/lib/python3.5/site-packages/django_cassandra_engine/connection.py", line 12, in execute
return self.connection.execute(*args, **kwargs)
File "/home/vkchlt0079/virtuals/wlc-env/lib/python3.5/site-packages/django_cassandra_engine/connection.py", line 86, in execute
self.session.set_keyspace(self.keyspace)
File "cassandra/cluster.py", line 2448, in cassandra.cluster.Session.set_keyspace (cassandra/cluster.c:48048)
File "cassandra/cluster.py", line 2030, in cassandra.cluster.Session.execute (cassandra/cluster.c:38536)
File "cassandra/cluster.py", line 3844, in cassandra.cluster.ResponseFuture.result (cassandra/cluster.c:80834)
cassandra.OperationTimedOut: errors={'192.168.98.65': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=192.168.98.65
I tried to increase the timeout, but it is not working, and I am not sure where it should be set.
# project/tasks.py
from celery.signals import worker_process_init
from django.db import connection

@worker_process_init.connect
def connect_db(**kwargs):
    connection.reconnect()
This will initiate the DB connection required by the Django Cassandra engine. Reference
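If the OperationTimedOut persists after reconnecting, the driver-side request timeout can also be raised. A hedged sketch using the session object exposed by cqlengine (the 120-second value is illustrative; the driver default is 10 seconds):
from cassandra.cqlengine import connection

session = connection.get_session()
session.default_timeout = 120  # seconds; applies to Session.execute calls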
