So our use case might be out of the remit of what Celery can do, but I thought I'd ask...
Use Case
We are planning on using a hosted/managed RabbitMQ cluster backing which Celery will be using for it's broker.
We want to ensure that our app has 0 downtime (obviously) so we're trying to figure out how we can handle the event when our upstream cluster has a catastrophic failure whereby the entire cluster is unavailable.
Our thought is that we have a standby Rabbit cluster that when the connection drops, we can automatically switch Celery to use that connection instead.
In the meantime, Celery is determining whether the master cluster is up and running and when it is, all of the publishers reconnect to the master, the workers drain the backup cluster and when empty, switch back onto the master.
The issue
What I'm having difficulty with is capturing the connection failure as it seems to happen deep within celery as the Exception doesn't bubble up to the app.
I can see that Celery has a BROKER_FAILOVER_STRATEGY configuration property, which would handle the initial swap, but it (seemingly) is only utilised when failover occurs, which doesn't fit our use case of swapping back to the master when it is back up.
I've also come across Celery's "bootsteps", but these are applied after Celery's own "Connection" bootstep which is where the exception is being thrown.
I have a feeling this approach is probably not the best one given the limitations I've been finding, but has anyone got any ideas on how I'd go about overriding the default Connection bootstep or achieving this via a different means?
It's quite old, but maybe useful to someone. I'm usin FastApi with Celery 5.2.
run_api.py file:
import uvicorn
if __name__ == "__main__":
port=8893
print("Starting API server on port {}".format(port))
uvicorn.run("endpoints:app", host="localhost", port=port, access_log=False)
endpoints.py file:
import threading
import time
import os
from celery import Celery
from fastapi import FastAPI
import itertools
import random
# Create object for fastAPI
app = FastAPI()
# Create and onfigure Celery to manage queus
# ----
celery = Celery(__name__)
celery.conf.broker_url = ["redis://localhost:6379"]
celery.conf.result_backend = "redis://localhost:6379"
celery.conf.task_track_started = True
celery.conf.task_serializer = "pickle"
celery.conf.result_serializer = "pickle"
celery.conf.accept_content = ["pickle"]
def random_failover_strategy(servers):
# The next line is necessary to work, even you don't use them:
it = list(servers) # don't modify callers list
shuffle = random.shuffle
for _ in itertools.repeat(None):
# Do whatever action required here to obtain the new url
# As an example, ra.
ra = random.randint(0, 100)
it = [f"redis://localhost:{str(ra)}"]
celery.conf.result_backend = it[0]
shuffle(it)
yield it[0]
celery.conf.broker_failover_strategy = random_failover_strategy
# Start the celery worker. I start it in a separate thread, so fastapi can run in parallel
worker = celery.Worker()
def start_worker():
worker.start()
ce = threading.Thread(target=start_worker)
ce.start()
# ----
#app.get("/", tags=["root"])
def root():
return {"message": ""}
#app.post("/test")
def test(num: int):
task = test_celery.delay(num)
print(f'task id: {task.id}')
return {
"task_id": task.id,
"task_status": "PENDING"}
#celery.task(name="test_celery", bind=True)
def test_celery(self, num):
self.update_state(state='PROGRESS')
print("ENTERED PROCESS", num)
time.sleep(100)
print("EXITING PROCESS", num)
return {'number': num}
#app.get("/result")
def result(id: str):
task_result = celery.AsyncResult(id)
if task_result.status == "SUCCESS":
return {
"task_status": task_result.status,
"task_num": task_result.result['number']
}
else:
return {
"task_status": task_result.status,
"task_num": None
}
Place both files in the same folder. Run python3 run_api.py.
Enjoy!
Related
Hello fellow developers,
I'm actually trying to create a small webapp that would allow me to monitor multiple binance accounts from a dashboard and maybe in the futur perform some small automatic trading actions.
My frontend is implemented with Vue+quasar and my backend server is based on python Flask for the REST api.
What I would like to do is being able to start a background process dynamically when a specific endpoint of my server is called. Once this process is started on the server, I would like it to communicate via websocket with my Vue client.
Right now I can spawn the worker and create the websocket communication, but somehow, I can't figure out how to make all the threads in my worker to work all together. Let me get a bit more specific:
Once my worker is started, I'm trying to create at least two threads. One is the infinite loop allowing me to automate some small actions and the other one is the flask-socketio server that will handle the sockets connections. Here is the code of that worker :
customWorker.py
import time
from flask import Flask
from flask_socketio import SocketIO, send, emit
import threading
import json
import eventlet
# custom class allowing me to communicate with my mongoDD
from db_wrap import DbWrap
from binance.client import Client
from binance.exceptions import BinanceAPIException, BinanceWithdrawException, BinanceRequestException
from binance.websockets import BinanceSocketManager
def process_message(msg):
print('got a websocket message')
print(msg)
class customWorker:
def __init__(self, workerId, sleepTime, dbWrap):
self.workerId = workerId
self.sleepTime = sleepTime
self.socketio = None
self.dbWrap = DbWrap()
# this retrieves worker configuration from database
self.config = json.loads(self.dbWrap.get_worker(workerId))
keys = self.dbWrap.get_worker_keys(workerId)
self.binanceClient = Client(keys['apiKey'], keys['apiSecret'])
def handle_message(self, data):
print ('My PID is {} and I received {}'.format(os.getpid(), data))
send(os.getpid())
def init_websocket_server(self):
app = Flask(__name__)
socketio = SocketIO(app, async_mode='eventlet', logger=True, engineio_logger=True, cors_allowed_origins="*")
eventlet.monkey_patch()
socketio.on_event('message', self.handle_message)
self.socketio = socketio
self.app = app
def launch_main_thread(self):
while True:
print('My PID is {} and workerId {}'
.format(os.getpid(), self.workerId))
if self.socketio is not None:
info = self.binanceClient.get_account()
self.socketio.emit('my_account', info, namespace='/')
def launch_worker(self):
self.init_websocket_server()
self.socketio.start_background_task(self.launch_main_thread)
self.socketio.run(self.app, host="127.0.0.1", port=8001, debug=True, use_reloader=False)
Once the REST endpoint is called, the worker is spawned by calling birth_worker() method of "Broker" object available within my server :
from custom_worker import customWorker
#...
def create_worker(self, workerid, sleepTime, dbWrap):
worker = customWorker(workerid, sleepTime, dbWrap)
worker.launch_worker()
def birth_worker(workerid, 5, dbwrap):
p = Process(target=self.create_worker, args=(workerid,10, botPipe, dbWrap))
p.start()
So when this is done, the worker is launched in a separate process that successfully creates threads and listens for socket connection. But my problem is that I can't use my binanceClient in my main thread. I think that it is using threads and the fact that I use eventlet and in particular the monkey_patch() function breaks it. When I try to call the binanceClient.get_account() method I get an error AttributeError: module 'select' has no attribute 'poll'
I'm pretty sure about that it comes from monkey_patch because if I use it in the init() method of my worker (before patching) it works and I can get the account info. So I guess there is a conflict here that I've been trying to resolve unsuccessfully.
I've tried using only the thread mode for my socket.io app by using async_mode=threading but then, my flask-socketio app won't start and listen for sockets as the line self.socketio.run(self.app, host="127.0.0.1", port=8001, debug=True, use_reloader=False) blocks everything
I'm pretty sure I have an architecture problem here and that I shouldn't start my app by launching socketio.run. I've been unable to start it with gunicorn for example because I need it to be dynamic and call it from my python scripts. I've been struggling to find the proper way to do this and that's why I'm here today.
Could someone please give me a hint on how is this supposed to be achieved ? How can I dynamically spawn a subprocess that will manage a socket server thread, an infinite loop thread and connections with binanceClient ? I've been roaming stack overflow without success, every advice is welcome, even an architecture reforge.
Here is my environnement:
Manjaro Linux 21.0.1
pip-chill:
eventlet==0.30.2
flask-cors==3.0.10
flask-socketio==5.0.1
pillow==8.2.0
pymongo==3.11.3
python-binance==0.7.11
websockets==8.1
I am encountering an undesired delay in the celery process that I cannot explain. My intent is to manage live processing of incoming data (at a rate of 10 to 60 data per seconds). Processing of one piece of data is divided into two fully sequential tasks but parallelization is used to start processing the next piece of data (with task 1) while processing the current one (with task 2) is not finished yet. Getting the shortest delay in the process is of at-most importance since it is a live application.
Once in a while, I encounter a freeze in the process. To see where this problem came from I started monitoring the occupation of my workers. It appeared that it happened during the communication between workers. I designed the lightest and simplest example to illustrate it here.
Here is my code, as you can see I have two tasks doing nothing but waiting 10ms each. I call them by using celery chains once every 20ms. I track each workers occupation by using prerun and postrun along with logging. In most of the case all is happening sequentially as time spent by both the workers doesn't exceed the send rate.
from __future__ import absolute_import
import time
from celery import chain
from celery.signals import task_prerun, task_postrun
from celery import Celery
from kombu import Queue, Exchange
N_ITS = 100000 # Total number of chains sent
LOG_FILE = 'log_file.txt' # Path to the log file
def write_to_log_file(text):
with open(LOG_FILE, 'a') as f:
f.write(text)
# Create celery app
app = Celery('live')
app.config_from_object('celeryconfig')
default_exchange = Exchange('default', type='direct')
app.conf.task_queues = tuple(Queue(route['queue'], default_exchange, routing_key=route['queue'])
for route in app.conf.task_routes.values() + [{'queue': 'default'}])
app.conf.update(result_expires=3600)
# Define functions that record timings
#task_prerun.connect()
def task_prerun(signal=None, sender=None, task_id=None, task=None, **kwargs):
text = 'task_prerun; {0}; {1:.16g}\n'.format(task.name, time.time())
write_to_log_file(text)
#task_postrun.connect()
def task_postrun(signal=None, sender=None, task_id=None, task=None, **kwargs):
text = 'task_postrun; {0}; {1:.16g}\n'.format(task.name, time.time())
write_to_log_file(text)
# Define tasks
#app.task
def task_1(i):
print 'Executing task_1: {}'.format(i)
time.sleep(0.01)
#app.task
def task_2(i):
print 'Executing task_2: {}'.format(i)
time.sleep(0.01)
# Send chained tasks
def main():
celery_chains = []
for i in range(N_ITS):
print '[{}] - Dispatching tasks'.format(i)
celery_chains.append(chain(task_1.si(i) | task_2.si(i))())
time.sleep(0.02)
# wait for all tasks to complete
[c.get() for c in celery_chains]
if __name__ == '__main__':
main()
I also give the configuration of celery if needed:
from __future__ import absolute_import
import os
name = 'live'
broker_url = 'pyamqp://{}'.format(os.environ.get('RMQ_HOST', 'localhost'))
print 'broker_url:', broker_url
include = ['live']
DEFAULT_QUEUE = 'celery'
# A named queue that's not already defined in task_queues will be created automatically.
task_create_missing_queues = True
broker_pool_limit = 10000
task_routes = {
'live.task_1': {'queue': 'worker_1'},
'live.task_2': {'queue': 'worker_2'}
}
# We always set the routing key to be the queue name so we do it here automatically.
for v in task_routes.values():
v.update({'routing_key': v['queue']})
task_serializer = 'pickle'
result_serializer = 'pickle'
accept_content = ['json', 'pickle']
timezone = 'Europe/Paris'
enable_utc = True
For the broker, I use the docker image rabbitmq:3.6-alpine with basic configurations appart that I enabled rabbitmq_management.
This resuts in the following worker occupation chronogram: (the color indicates the index of the data being processed, so you can link tasks belonging to the same chain)
As you can see, usually everything goes well and task 2 is called right after task 1 is finished. However, sometimes (indicated by the arrows on the figure) task 2 doesn't start immediately even though worker 2 isn't occupied. It imputes a delay of 27ms, which is more than twice the duration of a single task. This happened approximately every 2 seconds during this execution.
I made some additionnal investigation using firehose to study the message exchange in rabbitmq and it resulted that the messages are effectively sent on time. To my understanding, the worker waits to go fetch the message and process the task, but I cannot understand why.
I tried setting the broker pool limit to a high number but the issue remains.
I'm new to celery. All of the examples I've seen start a celery worker from the command line. e.g:
$ celery -A proj worker -l info
I'm starting a project on elastic beanstalk and thought it would be nice to have the worker be a subprocess of my web app. I tried using multiprocessing and it seems to work. I'm wondering if this is a good idea, or if there might be some disadvantages.
import celery
import multiprocessing
class WorkerProcess(multiprocessing.Process):
def __init__(self):
super().__init__(name='celery_worker_process')
def run(self):
argv = [
'worker',
'--loglevel=WARNING',
'--hostname=local',
]
app.worker_main(argv)
def start_celery():
global worker_process
worker_process = WorkerProcess()
worker_process.start()
def stop_celery():
global worker_process
if worker_process:
worker_process.terminate()
worker_process = None
worker_name = 'celery#local'
worker_process = None
app = celery.Celery()
app.config_from_object('celery_app.celeryconfig')
Seems like a good option, definitely not the only option but a good one :)
One thing you might want to look into (you might already be doing this), is linking the autoscaling to the size of your Celery queue. So you only scale up when the queue is growing.
Effectively Celery does something similar internally of course, so there's not a lot of difference. The only snag I can think of is the handling of external resources (database connections for example), that might be a problem but is completely dependent on what you are doing with Celery.
If anyone is interested, I did get this working on Elastic Beanstalk with a pre-configured AMI server running Python 3.4. I had a lot of problems with the Docker based server running Debian Jessie. Something to do with port remapping, maybe. Docker is kind of a black box, and I've found it very hard to work with and debug. Fortunately, the good folks at AWS just added a non-docker Python 3.4 option on April 8, 2015.
I did a lot of searching to get this deployed and working. I saw lots of questions without answers. So here's my very simple deployed python 3.4/flask/celery process.
Celery you can just pip install. You'll need to install rabbitmq from a configuration file with a config command or container_command. I'm using a script in my uploaded project zip, so a container_command is necessary to use the script (regular eb config command takes place before the project is installed).
[yourapproot]/.ebextensions/05_install_rabbitmq.config:
container_commands:
01RunScript:
command: bash ./init_scripts/app_setup.sh
[yourapproot]/init_scripts/app_setup.sh:
#!/usr/bin/env bash
# Download and install Erlang
yum install erlang
# Download the latest RabbitMQ package using wget:
wget http://www.rabbitmq.com/releases/rabbitmq-server/v3.5.1/rabbitmq-server-3.5.1-1.noarch.rpm
# Install rabbit
rpm --import http://www.rabbitmq.com/rabbitmq-signing-key-public.asc
yum -y install rabbitmq-server-3.5.1-1.noarch.rpm
# Start server
/sbin/service rabbitmq-server start
I'm doing a flask app, so I startup the workers before the first request:
#app.before_first_request
def before_first_request():
task_mgr.start_celery()
The task_mgr creates the celery app object (which I call celery, since the flask app object is app). The -Ofair is pretty key here, for a simple task manager. There's all kinds of strange behavior with task prefetch. This should maybe be the default?
task_mgr/task_mgr.py:
import celery as celery_module
import multiprocessing
class WorkerProcess(multiprocessing.Process):
def __init__(self):
super().__init__(name='celery_worker_process')
def run(self):
argv = [
'worker',
'--loglevel=WARNING',
'--hostname=local',
'-Ofair',
]
celery.worker_main(argv)
def start_celery():
global worker_process
multiprocessing.set_start_method('fork') # 'spawn' seems to work also
worker_process = WorkerProcess()
worker_process.start()
def stop_celery():
global worker_process
if worker_process:
worker_process.terminate()
worker_process = None
worker_name = 'celery#local'
worker_process = None
celery = celery_module.Celery()
celery.config_from_object('task_mgr.celery_config')
My config is pretty simple so far:
task_mgr/celery_config.py:
BROKER_URL = 'amqp://'
CELERY_RESULT_BACKEND = 'amqp://'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json' # 'pickle' warning: can't use datetime in json
CELERY_RESULT_SERIALIZER = 'json' # 'pickle' warning: can't use datetime in json
CELERY_TASK_RESULT_EXPIRES = 18000 # Results hang around for 5 hours
CELERYD_CONCURRENCY = 4
Then you can put tasks wherever you need them:
from task_mgr.task_mgr import celery
import time
#celery.task(bind=True)
def error_task(self):
self.update_state(state='RUNNING')
time.sleep(10)
raise KeyError('im an error')
#celery.task(bind=True)
def long_task(self):
self.update_state(state='RUNNING')
time.sleep(20)
return 'long task finished'
#celery.task(bind=True)
def task_with_status(self, wait):
self.update_state(state='RUNNING')
for i in range(5):
time.sleep(wait)
self.update_state(
state='PROGRESS',
meta={
'current': i + 1,
'total': 5,
'status': 'progress',
'host': self.request.hostname,
}
)
time.sleep(wait)
return 'finished with wait = ' + str(wait)
I also keep a task queue to hold the async results so I can monitor the tasks:
task_queue = []
def queue_task(task, *args):
async_result = task.apply_async(args)
task_queue.append(
{
'task_name':task.__name__,
'task_args':args,
'async_result':async_result
}
)
return async_result
def get_tasks_info():
tasks = []
for task in task_queue:
task_name = task['task_name']
task_args = task['task_args']
async_result = task['async_result']
task_id = async_result.id
task_state = async_result.state
task_result_info = async_result.info
task_result = async_result.result
tasks.append(
{
'task_name': task_name,
'task_args': task_args,
'task_id': task_id,
'task_state': task_state,
'task_result.info': task_result_info,
'task_result': task_result,
}
)
return tasks
And of course, start the tasks where you need to:
from webapp.app import app
from flask import url_for, render_template, redirect
from webapp import tasks
from task_mgr import task_mgr
#app.route('/start_all_tasks')
def start_all_tasks():
task_mgr.queue_task(tasks.long_task)
task_mgr.queue_task(tasks.error_task)
for i in range(1, 9):
task_mgr.queue_task(tasks.task_with_status, i * 2)
return redirect(url_for('task_status'))
#app.route('/task_status')
def task_status():
current_tasks = task_mgr.get_tasks_info()
return render_template(
'parse/task_status.html',
tasks=current_tasks
)
And that's about it. Let me know if you need any help, though my celery knowledge is still fairly limited.
I'll start using django-rq in my project.
Django integration with RQ, a Redis based Python queuing library.
What is the best practice of testing django apps which is using RQ?
For example, if I want to test my app as a black box, after User makes some actions I want to execute all jobs in current Queue, and then check all results in my DB. How can I do it in my django-tests?
I just found django-rq, which allows you to spin up a worker in a test environment that executes any tasks on the queue and then quits.
from django.test impor TestCase
from django_rq import get_worker
class MyTest(TestCase):
def test_something_that_creates_jobs(self):
... # Stuff that init jobs.
get_worker().work(burst=True) # Processes all jobs then stop.
... # Asserts that the job stuff is done.
I separated my rq tests into a few pieces.
Test that I'm correctly adding things to the queue (using mocks).
Assume that if something gets added to the queue, it will eventually be processed. (rq's test suite should cover this).
Test, given the correct input, my tasks work as expected. (normal code tests).
Code being tested:
def handle(self, *args, **options):
uid = options.get('user_id')
# ### Need to exclude out users who have gotten an email within $window
# days.
if uid is None:
uids = User.objects.filter(is_active=True, userprofile__waitlisted=False).values_list('id', flat=True)
else:
uids = [uid]
q = rq.Queue(connection=redis.Redis())
for user_id in uids:
q.enqueue(mail_user, user_id)
My tests:
class DjangoMailUsersTest(DjangoTestCase):
def setUp(self):
self.cmd = MailUserCommand()
#patch('redis.Redis')
#patch('rq.Queue')
def test_no_userid_queues_all_userids(self, queue, _):
u1 = UserF.create(userprofile__waitlisted=False)
u2 = UserF.create(userprofile__waitlisted=False)
self.cmd.handle()
self.assertItemsEqual(queue.return_value.enqueue.mock_calls,
[call(ANY, u1.pk), call(ANY, u2.pk)])
#patch('redis.Redis')
#patch('rq.Queue')
def test_waitlisted_people_excluded(self, queue, _):
u1 = UserF.create(userprofile__waitlisted=False)
UserF.create(userprofile__waitlisted=True)
self.cmd.handle()
self.assertItemsEqual(queue.return_value.enqueue.mock_calls, [call(ANY, u1.pk)])
I commited a patch that lets you do:
from django.test impor TestCase
from django_rq import get_queue
class MyTest(TestCase):
def test_something_that_creates_jobs(self):
queue = get_queue(async=False)
queue.enqueue(func) # func will be executed right away
# Test for job completion
This should make testing RQ jobs easier. Hope that helps!
Just in case this would be helpful to anyone. I used a patch with a custom mock object to do the enqueue that would run right away
#patch django_rq.get_queue
with patch('django_rq.get_queue', return_value=MockBulkJobGetQueue()) as mock_django_rq_get_queue:
#Perform web operation that starts job. In my case a post to a url
Then the mock object just had one method:
class MockBulkJobGetQueue(object):
def enqueue(self, f, *args, **kwargs):
# Call the function
f(
**kwargs.pop('kwargs', None)
)
what I've done for this case is to detect if I'm testing, and use fakeredis during tests. finally, in the test itself, I enqueue the redis worker task in synch mode:
first, define a function that detects if you're testing:
TESTING = len(sys.argv) > 1 and sys.argv[1] == 'test'
def am_testing():
return TESTING
then in your file that uses redis to queue up tasks, manage the queue this way.
you could extend get_queue to specify a queue name if needed:
if am_testing():
from fakeredis import FakeStrictRedis
from rq import Queue
def get_queue():
return Queue(connection=FakeStrictRedis())
else:
import django_rq
def get_queue():
return django_rq.get_queue()
then, enqueue your task like so:
queue = get_queue()
queue.enqueue(task_mytask, arg1, arg2)
finally, in your test program, run the task you are testing in synch mode, so that it runs in the same process as your test. As a matter of practice, I first clear the fakeredis queue, but I don't think its necessary since there are no workers:
from rq import Queue
from fakeredis import FakeStrictRedis
FakeStrictRedis().flushall()
queue = Queue(async=False, connection=FakeStrictRedis())
queue.enqueue(task_mytask, arg1, arg2)
my settings.py has the normal django_redis settings, so django_rq.getqueue() uses these when deployed:
RQ_QUEUES = {
'default': {
'HOST': env_var('REDIS_HOST'),
'PORT': 6379,
'DB': 0,
# 'PASSWORD': 'some-password',
'DEFAULT_TIMEOUT': 360,
},
'high': {
'HOST': env_var('REDIS_HOST'),
'PORT': 6379,
'DB': 0,
'DEFAULT_TIMEOUT': 500,
},
'low': {
'HOST': env_var('REDIS_HOST'),
'PORT': 6379,
'DB': 0,
}
}
None of the answers above really solved how to test without having redis installed and using django settings. I found including the following code in the tests does not impact the project itself yet gives everything needed.
The code uses fakeredis to pretend there is a Redis service available, set up the connection before RQ Django reads the settings.
The connection must be the same because in fakeredis connections
do not share the state. Therefore, it is a singleton object to reuse it.
from fakeredis import FakeStrictRedis, FakeRedis
class FakeRedisConn:
"""Singleton FakeRedis connection."""
def __init__(self):
self.conn = None
def __call__(self, _, strict):
if not self.conn:
self.conn = FakeStrictRedis() if strict else FakeRedis()
return self.conn
django_rq.queues.get_redis_connection = FakeRedisConn()
def test_case():
...
You'll need your tests to pause while there are still jobs in the queue. To do this, you can check Queue.is_empty(), and suspend execution if there are still jobs in the queue:
import time
from django.utils.unittest import TestCase
import django_rq
class TestQueue(TestCase):
def test_something(self):
# simulate some User actions which will queue up some tasks
# Wait for the queued tasks to run
queue = django_rq.get_queue('default')
while not queue.is_empty():
time.sleep(5) # adjust this depending on how long your tasks take to execute
# queued tasks are done, check state of the DB
self.assert(.....)
I came across the same issue. In addition, I executed in my Jobs e.g. some mailing functionality and then wanted to check the Django test mailbox if there were any E-Mail. However, since the with Django RQ the jobs are not executed in the same context as the Django test, the emails that are sent do not end up in the test mailbox.
Therefore I need to execute the Jobs in the same context. This can be achieved by:
from django_rq import get_queue
queue = get_queue('default')
queue.enqueue(some_job_callable)
# execute input watcher
jobs = queue.get_jobs()
# execute in the same context as test
while jobs:
for job in jobs:
queue.remove(job)
job.perform()
jobs = queue.get_jobs()
# check no jobs left in queue
assert not jobs
Here you just get all the jobs from the queue and execute them directly in the test. One can nicely implement this in a TestCase Class and reuse this functionality.
I'm using Celery to manage asynchronous tasks. Occasionally, however, the celery process goes down which causes none of the tasks to get executed. I would like to be able to check the status of celery and make sure everything is working fine, and if I detect any problems display an error message to the user. From the Celery Worker documentation it looks like I might be able to use ping or inspect for this, but ping feels hacky and it's not clear exactly how inspect is meant to be used (if inspect().registered() is empty?).
Any guidance on this would be appreciated. Basically what I'm looking for is a method like so:
def celery_is_alive():
from celery.task.control import inspect
return bool(inspect().registered()) # is this right??
EDIT: It doesn't even look like registered() is available on celery 2.3.3 (even though the 2.1 docs list it). Maybe ping is the right answer.
EDIT: Ping also doesn't appear to do what I thought it would do, so still not sure the answer here.
Here's the code I've been using. celery.task.control.Inspect.stats() returns a dict containing lots of details about the currently available workers, None if there are no workers running, or raises an IOError if it can't connect to the message broker. I'm using RabbitMQ - it's possible that other messaging systems might behave slightly differently. This worked in Celery 2.3.x and 2.4.x; I'm not sure how far back it goes.
def get_celery_worker_status():
ERROR_KEY = "ERROR"
try:
from celery.task.control import inspect
insp = inspect()
d = insp.stats()
if not d:
d = { ERROR_KEY: 'No running Celery workers were found.' }
except IOError as e:
from errno import errorcode
msg = "Error connecting to the backend: " + str(e)
if len(e.args) > 0 and errorcode.get(e.args[0]) == 'ECONNREFUSED':
msg += ' Check that the RabbitMQ server is running.'
d = { ERROR_KEY: msg }
except ImportError as e:
d = { ERROR_KEY: str(e)}
return d
From the documentation of celery 4.2:
from your_celery_app import app
def get_celery_worker_status():
i = app.control.inspect()
availability = i.ping()
stats = i.stats()
registered_tasks = i.registered()
active_tasks = i.active()
scheduled_tasks = i.scheduled()
result = {
'availability': availability,
'stats': stats,
'registered_tasks': registered_tasks,
'active_tasks': active_tasks,
'scheduled_tasks': scheduled_tasks
}
return result
of course you could/should improve the code with error handling...
To check the same using command line in case celery is running as daemon,
Activate virtualenv and go to the dir where the 'app' is
Now run : celery -A [app_name] status
It will show if celery is up or not plus no. of nodes online
Source:
http://michal.karzynski.pl/blog/2014/05/18/setting-up-an-asynchronous-task-queue-for-django-using-celery-redis/
The following worked for me:
import socket
from kombu import Connection
celery_broker_url = "amqp://localhost"
try:
conn = Connection(celery_broker_url)
conn.ensure_connection(max_retries=3)
except socket.error:
raise RuntimeError("Failed to connect to RabbitMQ instance at {}".format(celery_broker_url))
One method to test if any worker is responding is to send out a 'ping' broadcast and return with a successful result on the first response.
from .celery import app # the celery 'app' created in your project
def is_celery_working():
result = app.control.broadcast('ping', reply=True, limit=1)
return bool(result) # True if at least one result
This broadcasts a 'ping' and will wait up to one second for responses. As soon as the first response comes in, it will return a result. If you want a False result faster, you can add a timeout argument to reduce how long it waits before giving up.
I found an elegant solution:
from .celery import app
try:
app.broker_connection().ensure_connection(max_retries=3)
except Exception as ex:
raise RuntimeError("Failed to connect to celery broker, {}".format(str(ex)))
You can use ping method to check whether any worker (or specific worker) is alive or not https://docs.celeryproject.org/en/latest/_modules/celery/app/control.html#Control.ping
celey_app.control.ping()
You can test on your terminal by running the following command.
celery -A proj_name worker -l INFO
You can review every time your celery runs.
The below script is worked for me.
#Import the celery app from project
from application_package import app as celery_app
def get_celery_worker_status():
insp = celery_app.control.inspect()
nodes = insp.stats()
if not nodes:
raise Exception("celery is not running.")
logger.error("celery workers are: {}".format(nodes))
return nodes
Run celery status to get the status.
When celery is running,
(venv) ubuntu#server1:~/project-dir$ celery status
-> celery#server1: OK
1 node online.
When no celery worker is running, you get the below information displayed in terminal.
(venv) ubuntu#server1:~/project-dir$ celery status
Error: No nodes replied within time constraint