I have a problem with every insert query (a small query) that is executed asynchronously in a Celery task.
In sync mode the insert works fine, but when it is executed via apply_async() I get this:
OperationTimedOut('errors=errors=errors={}, last_host=***.***.*.***, last_host=None, last_host=None',)
Traceback:
Traceback (most recent call last):
File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/celery/app/trace.py", line 437, in __protected_call__
return self.run(*args, **kwargs)
File "/var/nfs_www/***/www_v1/app/mods/news_feed/tasks.py", line 26, in send_new_comment_reply_notifications
send_new_comment_reply_notifications_method(comment_id)
File "/var/nfs_www/***www_v1/app/mods/news_feed/methods.py", line 83, in send_new_comment_reply_notifications
comment_type='comment_reply'
File "/var/nfs_www/***/www_v1/app/mods/news_feed/models/storage.py", line 129, in add
CommentsFeed(**kwargs).save()
File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cqlengine/models.py", line 531, in save
consistency=self.__consistency__).save()
File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cqlengine/query.py", line 907, in save
self._execute(insert)
File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cqlengine/query.py", line 786, in _execute
tmp = execute(q, consistency_level=self._consistency)
File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cqlengine/connection.py", line 95, in execute
result = session.execute(query, params)
File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cassandra/cluster.py", line 1103, in execute
result = future.result(timeout)
File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cassandra/cluster.py", line 2475, in result
raise OperationTimedOut(errors=self._errors, last_host=self._current_host)
OperationTimedOut: errors={}, last_host=***.***.*.***
Does anyone have any ideas about this problem?
I found this: when cassandra-driver was executing the query, it returned the OperationTimedOut error. But my query is very small, and the problem occurs only in Celery tasks.
UPDATE:
I made a test task and it raises this error too.
@celery.task()
def test_task_with_cassandra():
    from app import cassandra_session
    cassandra_session.execute('use news_feed')
    return 'Done'
UPDATE 2:
Made this:
@celery.task()
def test_task_with_cassandra():
    from cqlengine import connection
    connection.setup(app.config['CASSANDRA_SERVERS'], port=app.config['CASSANDRA_PORT'],
                     default_keyspace='test_keyspace')
    from .models import Feed
    Feed.objects.count()
    return 'Done'
Got this:
NoHostAvailable('Unable to connect to any servers', {'***.***.*.***': OperationTimedOut('errors=errors=Timed out creating connection, last_host=None, last_host=None',)})
From the shell I can connect to it.
UPDATE 3:
From a deleted thread on a GitHub issue (found in my emails); this worked for me too:
Here's how, in substance, I plug CQLengine into Celery:
from celery import Celery
from celery.signals import worker_process_init, beat_init
from cqlengine import connection
from cqlengine.connection import (
    cluster as cql_cluster, session as cql_session)

def cassandra_init():
    """ Initialize a clean Cassandra connection. """
    if cql_cluster is not None:
        cql_cluster.shutdown()
    if cql_session is not None:
        cql_session.shutdown()
    connection.setup()

# Initialize worker context for both standard and periodic tasks.
worker_process_init.connect(cassandra_init)
beat_init.connect(cassandra_init)

app = Celery()
This is crude, but it works. Should we add this snippet to the FAQ?
I had a similar issue. It seemed to be related to sharing the Cassandra session between tasks. I solved it by creating a session per thread. Make sure you call get_session() from your tasks, and then do this:
import threading

from cassandra.cluster import Cluster
from django.conf import settings  # imports added for completeness; the snippet assumes Django-style settings

thread_local = threading.local()

def get_session():
    if hasattr(thread_local, "cassandra_session"):
        return thread_local.cassandra_session
    cluster = Cluster(settings.CASSANDRA_HOSTS)
    session = cluster.connect(settings.CASSANDRA_KEYSPACE)
    thread_local.cassandra_session = session
    return session
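For illustration, a minimal sketch of how a task could use this helper; the task name and table are hypothetical:

@celery.task()
def count_feed_comments():
    # Each worker thread lazily creates and caches its own Cassandra session.
    session = get_session()
    rows = session.execute('SELECT COUNT(*) FROM comments_feed')  # hypothetical table
    return rows[0].count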
Inspired by Ron's answer, I came up with the following code to put in tasks.py:
import threading

from django.conf import settings
from cassandra.cluster import Cluster
from celery.signals import worker_process_init, worker_process_shutdown

thread_local = threading.local()

@worker_process_init.connect
def open_cassandra_session(*args, **kwargs):
    cluster = Cluster([settings.DATABASES["cassandra"]["HOST"],], protocol_version=3)
    session = cluster.connect(settings.DATABASES["cassandra"]["NAME"])
    thread_local.cassandra_session = session

@worker_process_shutdown.connect
def close_cassandra_session(*args, **kwargs):
    session = thread_local.cassandra_session
    session.shutdown()
    thread_local.cassandra_session = None
This neat solution automatically opens and closes Cassandra sessions when a Celery worker process starts and stops.
Side note: protocol_version=3 is used because Cassandra 2.1 only supports protocol versions 3 and lower.
The other answers didn't work for me, but the question's 'update 3' did. Here's what I ended up with (small updates to the suggestion within the question):
from celery.signals import worker_process_init
from django.conf import settings  # import added; the snippet uses Django settings
from cassandra.cqlengine import connection
from cassandra.cqlengine.connection import (
    cluster as cql_cluster, session as cql_session)

def cassandra_init(*args, **kwargs):
    """ Initialize a clean Cassandra connection. """
    if cql_cluster is not None:
        cql_cluster.shutdown()
    if cql_session is not None:
        cql_session.shutdown()
    connection.setup(settings.DATABASES["cassandra"]["HOST"].split(','), settings.DATABASES["cassandra"]["NAME"])
# Initialize worker context (only standard tasks)
worker_process_init.connect(cassandra_init)
Using django-cassandra-engine, the following resolved the issue for me:

# imports assumed for this snippet: Django connections and the Celery signals
from django.db import connections
from celery.signals import worker_process_init, worker_shutdown

db_connection = connections['cassandra']

@worker_process_init.connect
def connect_db(**_):
    db_connection.reconnect()

@worker_shutdown.connect
def disconnect(**_):
    db_connection.connection.close_all()
Take a look here.
Related
I want to set a query timeout in SQLAlchemy. I have an Oracle database.
I have tried the following code:
import sqlalchemy
engine = sqlalchemy.create_engine('oracle://db', connect_args={'querytimeout': 10})
I got the following error:
TypeError: 'querytimeout' is an invalid keyword argument for this function
I would like a solution looking like:
connection.execute('query').set_timeout(10)
Maybe it is possible to set the timeout in the SQL query itself? I found how to do it in PL/SQL, but I need plain SQL.
How could I set a query timeout?
The only way you can set a connection timeout for the Oracle engine from SQLAlchemy is to create and configure sqlnet.ora.
Linux
Create file sqlnet.ora in folder
/opt/oracle/instantclient_19_9/network/admin
Windows
On Windows, create a \network\admin folder, for example:
C:\oracle\instantclient_19_9\network\admin
Example sqlnet.ora file
SQLNET.INBOUND.CONNECT_TIMEOUT = 120
SQLNET.SEND_TIMEOUT = 120
SQLNET.RECV_TIMEOUT = 120
You can find more parameters here: https://docs.oracle.com/cd/E11882_01/network.112/e10835/sqlnet.htm
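If the client does not pick sqlnet.ora up automatically, you can point the TNS_ADMIN environment variable at the directory that contains it before creating the engine; a minimal sketch (paths and credentials are placeholders):

import os
import sqlalchemy

# Tell the Oracle client where to find sqlnet.ora (example path).
os.environ['TNS_ADMIN'] = '/opt/oracle/instantclient_19_9/network/admin'

engine = sqlalchemy.create_engine('oracle://user:password@dsn')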
The way to do it in Oracle is via resource manager. Have a look here
timeout decorator
Get your session handle as you normally would. (Notice that the session has not actually connected yet.) Then, test the session in a function that is decorated with wrapt_timeout_decorator.timeout.
#!/usr/bin/env python3
from time import time
from cx_Oracle import makedsn
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.sql import text
from wrapt_timeout_decorator import timeout

class ConnectionTimedOut(Exception):
    pass

class Blog:
    def __init__(self):
        self.port = None

    def connect(self, connection_timeout):
        @timeout(connection_timeout, timeout_exception=ConnectionTimedOut)
        def test_session(session):
            session.execute(text('select dummy from dual'))

        session = sessionmaker(bind=self.engine())()
        test_session(session)
        return session

    def engine(self):
        return create_engine(
            self.connection_string(),
            max_identifier_length=128
        )

    def connection_string(self):
        driver = 'oracle'
        username = 'USR'
        password = 'solarwinds123'
        return '%s://%s:%s@%s' % (
            driver,
            username,
            password,
            self.dsn()
        )

    def dsn(self):
        host = 'hn.com'
        dbname = 'ORCL'
        print('port: %s expected: %s' % (
            self.port,
            'success' if self.port == 1530 else 'timeout'
        ))
        return makedsn(host, self.port, dbname)

    def run(self):
        self.port = 1530
        session = self.connect(connection_timeout=4)
        for r in session.execute(text('select status from v$instance')):
            print(r.status)
        self.port = 1520
        session = self.connect(connection_timeout=4)
        for r in session.execute(text('select status from v$instance')):
            print(r.status)

if __name__ == '__main__':
    Blog().run()
In this example, the network is firewalled with port 1530 open. Port 1520 is blocked and leads to a TCP connection timeout. Output:
port: 1530 expected: success
OPEN
port: 1520 expected: timeout
Traceback (most recent call last):
File "./blog.py", line 68, in <module>
Blog().run()
File "./blog.py", line 62, in run
session = self.connect(connection_timeout=4)
File "./blog.py", line 27, in connect
test_session(session)
File "/home/exagriddba/lib/python3.8/site-packages/wrapt_timeout_decorator/wrapt_timeout_decorator.py", line 123, in wrapper
return wrapped_with_timeout(wrap_helper)
File "/home/exagriddba/lib/python3.8/site-packages/wrapt_timeout_decorator/wrapt_timeout_decorator.py", line 131, in wrapped_with_timeout
return wrapped_with_timeout_process(wrap_helper)
File "/home/exagriddba/lib/python3.8/site-packages/wrapt_timeout_decorator/wrapt_timeout_decorator.py", line 145, in wrapped_with_timeout_process
return timeout_wrapper()
File "/home/exagriddba/lib/python3.8/site-packages/wrapt_timeout_decorator/wrap_function_multiprocess.py", line 43, in __call__
self.cancel()
File "/home/exagriddba/lib/python3.8/site-packages/wrapt_timeout_decorator/wrap_function_multiprocess.py", line 51, in cancel
raise_exception(self.wrap_helper.timeout_exception, self.wrap_helper.exception_message)
File "/home/exagriddba/lib/python3.8/site-packages/wrapt_timeout_decorator/wrap_helper.py", line 178, in raise_exception
raise exception(exception_message)
__main__.ConnectionTimedOut: Function test_session timed out after 4.0 seconds
Caution
Do not decorate the function that calls sessionmaker, or you will get:
_pickle.PicklingError: Can't pickle <class 'sqlalchemy.orm.session.Session'>: it's not the same object as sqlalchemy.orm.session.Session
SCAN
This implementation is a "connection timeout" without regard to the underlying cause. The client could time out before trying all available SCAN listeners.
I have written a module that dynamically adds periodic Celery tasks based on a list of dictionaries in the project's settings (imported via django.conf.settings).
I do that using an add_tasks function that schedules a task to be called with a specific uuid given in the settings:
def add_tasks(celery):
    for new_task in settings.NEW_TASKS:
        celery.add_periodic_task(
            new_task['interval'],
            my_task.s(new_task['uuid']),
            name='My Task %s' % new_task['uuid'],
        )
As suggested here, I use the on_after_configure.connect signal to call the function in my celery.py:
app = Celery('my_app')

@app.on_after_configure.connect
def setup_periodic_tasks(celery, **kwargs):
    from add_tasks_module import add_tasks
    add_tasks(celery)
This setup works fine for both celery beat and celery worker, but it breaks my setup where I use uwsgi to serve my Django application. uwsgi runs smoothly until the first time view code sends a task using Celery's .delay() method. At that point Celery is initialized inside uwsgi but blocks forever in the above code. If I run this manually from the command line and then interrupt it when it blocks, I get the following (shortened) stack trace:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'tasks'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'data'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'tasks'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
(SHORTENED HERE. Just contained the trace from the console through my call to this function)
File "/opt/my_app/add_tasks_module/__init__.py", line 42, in add_tasks
my_task.s(new_task['uuid']),
File "/usr/local/lib/python3.6/site-packages/celery/local.py", line 146, in __getattr__
return getattr(self._get_current_object(), name)
File "/usr/local/lib/python3.6/site-packages/celery/local.py", line 109, in _get_current_object
return loc(*self.__args, **self.__kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/app/__init__.py", line 72, in task_by_cons
return app.tasks[
File "/usr/local/lib/python3.6/site-packages/kombu/utils/objects.py", line 44, in __get__
value = obj.__dict__[self.__name__] = self.__get(obj)
File "/usr/local/lib/python3.6/site-packages/celery/app/base.py", line 1228, in tasks
self.finalize(auto=True)
File "/usr/local/lib/python3.6/site-packages/celery/app/base.py", line 507, in finalize
with self._finalize_mutex:
It seems like there is a problem with acquiring a mutex.
Currently I am working around this by checking whether sys.argv[0] contains uwsgi and, if so, not adding the periodic tasks (only beat needs them), but I would like to understand what is going wrong here so I can solve the problem more permanently.
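For reference, the workaround boils down to a guard like this (a sketch of the idea, not a recommended fix):

import sys

# Skip registering the periodic tasks when running under uwsgi;
# only celery beat actually needs them.
if 'uwsgi' not in sys.argv[0]:
    from add_tasks_module import add_tasks
    add_tasks(app)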
Could this problem have something to do with running uwsgi multi-threaded or multi-processed, where one thread/process holds the mutex that another needs?
I'd appreciate any hints that can help me solve the problem. Thank you.
I am using: Django 1.11.7 and Celery 4.1.0
Edit 1
I have created a minimal setup for this problem:
celery.py:
import os
from celery import Celery
from django.conf import settings
from myapp.tasks import my_task

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_app.settings')

app = Celery('my_app')

@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    sender.add_periodic_task(
        60,
        my_task.s(),
        name='Testtask'
    )

app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
tasks.py:
from celery import shared_task

@shared_task()
def my_task():
    print('ran')
Make sure that CELERY_TASK_ALWAYS_EAGER=False and that you have a working message queue.
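For completeness, a minimal settings sketch for that precondition (the Redis URL is only an example; any working broker will do):

# settings.py (sketch)
CELERY_TASK_ALWAYS_EAGER = False                 # tasks must go through the broker
CELERY_BROKER_URL = 'redis://localhost:6379/0'   # example broker URL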
Run:
./manage.py shell -c 'from myapp.tasks import my_task; my_task.delay()'
Wait about 10 seconds before interrupting to see the above error.
So, I have found out that the @shared_task decorator creates the problem. I can circumvent it by declaring the task right inside the function called by the signal, like so:
def add_tasks(celery):
    @celery.task
    def my_task(uuid):
        print(uuid)

    for new_task in settings.NEW_TASKS:
        celery.add_periodic_task(
            new_task['interval'],
            my_task.s(new_task['uuid']),
            name='My Task %s' % new_task['uuid'],
        )
This solution actually works for me, but I have one more problem: I use this code in a pluggable app, so I can't directly access the Celery app outside of the signal handler, yet I would also like to be able to call the my_task function from other code. Because it is defined inside the function, it is not available outside of it, so I cannot import it anywhere else.
I can probably work around this by defining the task function outside of the signal function and using it with different decorators here and in tasks.py. I am wondering, though, whether there is a decorator other than @shared_task that I can use in tasks.py that does not create the problem.
The current best solution could be:
task_app.__init__.py:
def my_task(uuid):
    # do stuff
    print(uuid)

def add_tasks(celery):
    celery_my_task = celery.task(my_task)
    for new_task in settings.NEW_TASKS:
        celery.add_periodic_task(
            new_task['interval'],
            celery_my_task.s(new_task['uuid']),  # pass a signature, not a direct call
            name='My Task %s' % new_task['uuid'],
        )
task_app.tasks.py:
from celery import shared_task
from task_app import my_task
shared_my_task = shared_task(my_task)
myapp.celery.py:
import os
from celery import Celery
from django.conf import settings

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_app.settings')

app = Celery('my_app')

@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    from task_app import add_tasks
    add_tasks(sender)

app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
Could you give the @app.on_after_finalize.connect signal a try?
Here is a quick snippet from a working project with celery==4.1.0, Django==2.0, django-celery-beat==1.1.0 and django-celery-results==1.0.1:
@app.on_after_finalize.connect
def setup_periodic_tasks(sender, **kwargs):
    """ setup of periodic task :py:func:shopify_data_fetcher.celery.fetch_shopify
    based on the schedule defined in: settings.CELERY_BEAT_SCHEDULE
    """
    for task_name, task_config in settings.CELERY_BEAT_SCHEDULE.items():
        sender.add_periodic_task(
            task_config['schedule'],
            # unpack the whole kwargs dict so resource_name is passed as a keyword
            fetch_shopify.s(**task_config['kwargs']),
            name=task_name
        )
piece of CELERY_BEAT_SCHEDULE:
CELERY_BEAT_SCHEDULE = {
    'fetch_shopify_orders': {
        'task': 'shopify.tasks.fetch_shopify',
        'schedule': crontab(hour="*/3", minute=0),
        'kwargs': {
            'resource_name': shopify_constants.SHOPIFY_API_RESOURCES_ORDERS
        }
    }
}
I'm getting the following error (which I assume is because of the forking in my application), "This result object does not return rows".
Traceback
---------
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/dask/async.py", line 263, in execute_task
result = _execute_task(task, data)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/dask/async.py", line 245, in _execute_task
return func(*args2)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/smg/analytics/services/impact_analysis.py", line 140, in _do_impact_analysis_mp
Correlation.user_id.in_(user_ids)).all())
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2241, in all
return list(self)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 65, in instances
fetch = cursor.fetchall()
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 752, in fetchall
self.cursor, self.context)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1027, in _handle_dbapi_exception
util.reraise(*exc_info)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 746, in fetchall
l = self.process_rows(self._fetchall_impl())
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 715, in _fetchall_impl
self._non_result()
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 720, in _non_result
"This result object does not return rows. "
I'm using dask and its multiprocessing scheduler (which uses multiprocessing.Pool).
As I understand it (based on the documentation), sessions created from a scoped session object (created via scoped_session()) are threadsafe because they are thread-local. This leads me to believe that when I call Session() (or use the proxy Session), I get a session object that exists only in, and is accessible only from, the thread it was called from.
This seems pretty straight forward.
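A tiny sketch of the thread-local behaviour I'm assuming (not from my application, just to illustrate):

from sqlalchemy.orm import scoped_session, sessionmaker

Session = scoped_session(sessionmaker())

# Within a single thread, the registry hands back the same Session object.
assert Session() is Session()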
What I am confused about is why I'm having issues when forking the process. I understand that you can't re-use an engine across processes, so I've followed the event-based solution verbatim from the docs and done this:
class _DB(object):

    _engine = None

    @classmethod
    def _get_engine(cls, force_new=False):
        if cls._engine is None or force_new is True:
            cfg = Config.get_config()
            user = cfg['USER']
            host = cfg['HOST']
            password = cfg['PASSWORD']
            database = cfg['DATABASE']
            engine = create_engine(
                'mysql://{}:{}@{}/{}?local_infile=1&'
                'unix_socket=/var/run/mysqld/mysqld.sock'.
                format(user, password, host, database),
                pool_size=5, pool_recycle=3600)
            cls._engine = engine
        return cls._engine

# From the docs, handles multiprocessing
@event.listens_for(_DB._get_engine(), "connect")
def connect(dbapi_connection, connection_record):
    connection_record.info['pid'] = os.getpid()

# From the docs, handles multiprocessing
@event.listens_for(_DB._get_engine(), "checkout")
def checkout(dbapi_connection, connection_record, connection_proxy):
    pid = os.getpid()
    if connection_record.info['pid'] != pid:
        connection_record.connection = connection_proxy.connection = None
        raise exc.DisconnectionError(
            "Connection record belongs to pid %s, "
            "attempting to check out in pid %s" %
            (connection_record.info['pid'], pid)
        )

# The following is how I create the scoped session object.
Session = scoped_session(sessionmaker(
    bind=_DB._get_engine(), autocommit=False, autoflush=False))

Base = declarative_base()
Base.query = Session.query_property()
So my assumptions (based on the docs) are the following:
A session object created from a scoped session object must always give me a thread-local session (which in my case would just be the main thread of the child process). Although this is not in the docs, I imagine it should apply even if the scoped session object was created in another process.
The thread-local session will get a connection from the pool via the engine; if the connection was not created within this process, a new one will be created (based on the above connect() and checkout() implementations).
If both of these things were true, then everything should "just work" (AFAICT). That's not the case though.
I managed to get it to work by creating a new scoped session object in each new process and using it in all subsequent calls that use a session.
BTW the Base.query attribute needed to be updated from this new scoped session object as well.
I imagine that my #1 assumption above is incorrect. Can anyone help me understand why I need to create a new scoped session object in each process?
Cheers.
It is not clear when your fork happens, but the most common issue is that the engine is created before the fork. That initializes TCP connections to the database (your pool_size=5), which then get copied over to the new processes, so multiple processes end up interacting with the same physical sockets => trouble.
Options are to (see the sketches after this list):
Disable the pool and use an on-demand connection: poolclass=NullPool
Re-create the pool after the fork: sqla_engine.dispose()
Delay the create_engine call until after the fork
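Hedged sketches of the first two options (the URL and names are placeholders):

from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# Option 1: no pooling at all; every checkout opens a fresh connection.
engine = create_engine('mysql://user:pw@host/db', poolclass=NullPool)

# Option 2: keep the pool, but discard connections inherited from the
# parent right after the fork, inside the child process.
def child_initializer():
    engine.dispose()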
I'm attempting to get RQ/RQ-Worker running on my Flask application. I've tried to get it down to a very simple test case. Here's the general idea:
The user visits the /test page, which triggers a job to be queued and returns the queued job's job_key.
The worker (worker.py) processes the queued job.
The user can then visit the /retrieve/<job_key> page to retrieve the result. [This is not shown.]
The current job is just to add 2 + 2.
Here is the application code:
from flask import Flask  # Flask import added for completeness
from rq import Queue
from rq.job import Job

# import conn from worker.py
from worker import conn

app = Flask(__name__)
q = Queue(connection=conn)

def add():
    return 2+2

@app.route('/test')
def test():
    job = q.enqueue_call(func="add", args=None, result_ttl=5000)
    return job.get_id()

if __name__ == "__main__":
    app.run()
The worker.py source code looks like this:
from redis import StrictRedis
from rq import Worker, Queue, Connection

listen = ['default']
redis_url = 'redis://localhost:6379'
conn = StrictRedis.from_url(redis_url)

if __name__ == "__main__":
    with Connection(conn):
        worker = Worker(list(map(Queue, listen)))
        worker.work()
To my knowledge, the application code isn't the issue. I can visit the /test page which will enqueue the job. However, once I run the worker, I get the following error:
Traceback (most recent call last):
File "/home/<>/dev/sched/venv/lib/python3.5/site-packages/rq/worker.py", line 588, in perform_job
rv = job.perform()
File "/home/<>/dev/sched/venv/lib/python3.5/site-packages/rq/job.py", line 498, in perform
self._result = self.func(*self.args, **self.kwargs)
File "/home/<>/dev/sched/venv/lib/python3.5/site-packages/rq/job.py", line 206, in func
return import_attribute(self.func_name)
File "/home/<>/dev/sched/venv/lib/python3.5/site-packages/rq/utils.py", line 149, in import_attribute
module_name, attribute = name.rsplit('.', 1)
ValueError: not enough values to unpack (expected 2, got 1)
I feel like the line
worker = Worker(list(map(Queue, listen)))
is the problem, just because of the nature of the error, but I have no idea how to fix it, especially because I've seen other projects that seem to use exactly the same worker source code.
My technology stack is:
Flask (0.11.1)
Redis (2.10.5)
RQ (0.6.0)
RQ-Worker (0.0.1)
EDIT:
Beginning to think this is a bug. Check out this issue ticket in RQ's source: issue #531.
For me the issue was caused by RQ not being able to resolve my worker module.
The solution was to supply the "qualified" name to enqueue, e.g.:
job = q.enqueue("app.worker.add", data)
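Alternatively, assuming the function is importable from the web process, you can pass the function object itself and let RQ derive the dotted path (the module layout here is hypothetical):

from app.worker import add   # hypothetical module path

job = q.enqueue(add, result_ttl=5000)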
Update 3/4:
I've done some testing and proved that using a checkout event handler to check for disconnects works with Elixir. I was beginning to think my problem had something to do with calling session.commit() from a subprocess? Update: I just disproved myself by calling session.commit() in a subprocess; updated example below. I'm using the multiprocessing module to create the subprocess.
Here's the code that shows how it should work (without even using pool_recycle!):
from sqlalchemy import exc
from sqlalchemy import event
from sqlalchemy.pool import Pool
from elixir import *
import multiprocessing as mp

class SubProcess(mp.Process):
    def run(self):
        a3 = TestModel(name="monkey")
        session.commit()

class TestModel(Entity):
    name = Field(String(255))

@event.listens_for(Pool, "checkout")
def ping_connection(dbapi_connection, connection_record, connection_proxy):
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SELECT 1")
    except:
        # optional - dispose the whole pool
        # instead of invalidating one at a time
        # connection_proxy._pool.dispose()

        # raise DisconnectionError - pool will try
        # connecting again up to three times before raising.
        raise exc.DisconnectionError()
    cursor.close()

from sqlalchemy import create_engine
metadata.bind = create_engine("mysql://foo:bar@localhost/some_db", echo_pool=True)
setup_all(True)

subP = SubProcess()
a1 = TestModel(name='foo')
session.commit()

# pool size is now three.
print "Restart the server"
raw_input()

subP.start()
#a2 = TestModel(name='bar')
#session.commit()
Update 2:
I'm forced to find another solution, as post-1.2.2 versions of MySQL-python drop support for the reconnect param. Has anyone got a solution? :\
Update 1 (old solution; doesn't work for MySQL-python versions > 1.2.2):
Found a solution: passing connect_args={'reconnect': True} to the create_engine call fixes the problem and automagically reconnects. I don't even seem to need the checkout event handler.
So, in the example from the question:
metadata.bind = create_engine("mysql://foo:bar@localhost/db_name", pool_size=100, pool_recycle=3600, connect_args={'reconnect': True})
Original question:
I've done quite a bit of Googling for this problem and haven't found a solution specific to Elixir. I'm trying to use the "Disconnect Handling - Pessimistic" example from the SQLAlchemy docs to handle MySQL disconnects. However, when I test this (by restarting the MySQL server), the "MySQL server has gone away" error is raised before my checkout event handler runs.
Here's the code I use to initialize elixir:
##### Initialize elixir/SQLAlchemy

# Disconnect handling
from sqlalchemy import exc
from sqlalchemy import event
from sqlalchemy.pool import Pool

@event.listens_for(Pool, "checkout")
def ping_connection(dbapi_connection, connection_record, connection_proxy):
    logging.debug("***********ping_connection**************")
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SELECT 1")
    except:
        logging.debug("######## DISCONNECTION ERROR #########")

        # optional - dispose the whole pool
        # instead of invalidating one at a time
        # connection_proxy._pool.dispose()

        # raise DisconnectionError - pool will try
        # connecting again up to three times before raising.
        raise exc.DisconnectionError()
    cursor.close()

metadata.bind = create_engine("mysql://foo:bar@localhost/db_name", pool_size=100, pool_recycle=3600)
setup_all()
I create Elixir entity objects and save them with session.commit(), during which I see the "ping_connection" message generated by the event defined above. However, when I restart the MySQL server and test it again, it fails with the "MySQL server has gone away" message just before the ping_connection event.
Here's the stack trace starting from the relevant lines:
File "/usr/local/lib/python2.6/dist-packages/elixir/entity.py", line 1135, in get_by
return cls.query.filter_by(*args, **kwargs).first()
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/orm/query.py", line 1963, in first
ret = list(self[0:1])
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/orm/query.py", line 1857, in __getitem__
return list(res)
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/orm/query.py", line 2032, in __iter__
return self._execute_and_instances(context)
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/orm/query.py", line 2047, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/engine/base.py", line 1399, in execute
params)
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/engine/base.py", line 1532, in _execute_clauseelement
compiled_sql, distilled_params
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/engine/base.py", line 1640, in _execute_context
context)
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/engine/base.py", line 1633, in _execute_context
context)
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/engine/default.py", line 330, in do_execute
cursor.execute(statement, parameters)
File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 166, in execute
self.errorhandler(self, exc, value)
File "/usr/lib/pymodules/python2.6/MySQLdb/connections.py", line 35, in defaulterrorhandler
raise errorclass, errorvalue
OperationalError: (OperationalError) (2006, 'MySQL server has gone away')
The final workaround was calling session.remove() at the start of methods, before manipulating and loading Elixir entities. What this does is return the connection to the pool, so that when it's used again the pool's checkout event fires and our handler detects the disconnection. From the SQLAlchemy docs:
It’s not strictly necessary to remove the session at the end of the request - other options include calling Session.close(), Session.rollback(), Session.commit() at the end so that the existing session returns its connections to the pool and removes any existing transactional context. Doing nothing is an option too, if individual controller methods take responsibility for ensuring that no transactions remain open after a request ends.
Quite an important little piece of information; I wish it were mentioned in the Elixir docs. But then I guess it assumes prior knowledge of SQLAlchemy?
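In code, the workaround amounts to something like this (a sketch; the method name is made up):

def load_feed_entries():
    # Return the previous connection to the pool first, so the next query
    # triggers a fresh checkout and the handler can detect a dead connection.
    session.remove()
    return TestModel.query.all()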
The actual problem is SQLAlchemy giving you the same session every time you call the sessionmaker factory. Because of this, a later query can be performed with a session that was opened much earlier, as long as you did not call session.remove() on it. Having to remember to call remove() every time you request a session is no fun, however, and SQLAlchemy provides something much simpler: contextual "scoped" sessions.
To create a scoped session simply wrap your sessionmaker:
from sqlalchemy.orm import scoped_session, sessionmaker
Session = scoped_session(sessionmaker())
This way you get a contextually bound session every time you call the factory, meaning SQLAlchemy calls session.remove() for you as soon as the calling function exits. See here: sqlalchemy - lifespan of a contextual session
Are you using the same session for both operations (before and after the mysqld restart)? If so, note that the "checkout" event occurs only when a new transaction is started. When you call commit(), a new transaction is started (unless you use autocommit mode) and a connection is checked out. So you are restarting mysqld after the checkout.
The simple hack of calling commit() or rollback() just before the second operation (and after restarting mysqld) should solve your problem. Otherwise, consider using a fresh session each time you have waited a long time since the previous commit.
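Applied to the question's example, the hack would look roughly like this (a sketch):

a1 = TestModel(name='foo')
session.commit()

# ... restart mysqld here ...

# End the stale transaction so the next statement checks a connection out
# of the pool again and the "checkout" handler gets a chance to run.
session.rollback()

a2 = TestModel(name='bar')
session.commit()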
I'm not sure if this is the same problem that I had, but here goes:
When I encountered MySQL server has gone away, I solved it using create_engine(..., pool_recycle=3600), see http://www.sqlalchemy.org/docs/dialects/mysql.html#connection-timeouts