Detect whether Celery is Available/Running - python

I'm using Celery to manage asynchronous tasks. Occasionally, however, the Celery process goes down, and none of the tasks get executed. I would like to be able to check the status of Celery and make sure everything is working fine, and if I detect any problems, display an error message to the user. From the Celery Worker documentation it looks like I might be able to use ping or inspect for this, but ping feels hacky and it's not clear exactly how inspect is meant to be used (is it a problem if inspect().registered() is empty?).
Any guidance on this would be appreciated. Basically what I'm looking for is a method like so:
def celery_is_alive():
    from celery.task.control import inspect
    return bool(inspect().registered())  # is this right??
EDIT: It doesn't even look like registered() is available on Celery 2.3.3 (even though the 2.1 docs list it). Maybe ping is the right answer.
EDIT: ping also doesn't appear to do what I thought it would do, so I'm still not sure what the answer is.

Here's the code I've been using. celery.task.control.Inspect.stats() returns a dict containing lots of details about the currently available workers, None if there are no workers running, or raises an IOError if it can't connect to the message broker. I'm using RabbitMQ - it's possible that other messaging systems might behave slightly differently. This worked in Celery 2.3.x and 2.4.x; I'm not sure how far back it goes.
def get_celery_worker_status():
    ERROR_KEY = "ERROR"
    try:
        from celery.task.control import inspect
        insp = inspect()
        d = insp.stats()
        if not d:
            d = {ERROR_KEY: 'No running Celery workers were found.'}
    except IOError as e:
        from errno import errorcode
        msg = "Error connecting to the backend: " + str(e)
        if len(e.args) > 0 and errorcode.get(e.args[0]) == 'ECONNREFUSED':
            msg += ' Check that the RabbitMQ server is running.'
        d = {ERROR_KEY: msg}
    except ImportError as e:
        d = {ERROR_KEY: str(e)}
    return d
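For example, a small usage sketch of the function above (the ERROR key matches the code; everything else is illustrative):

status = get_celery_worker_status()
if "ERROR" in status:
    print("Celery problem: {}".format(status["ERROR"]))
else:
    # stats() keys the dict by worker node name
    print("Workers online: {}".format(", ".join(status)))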

From the documentation of celery 4.2:
from your_celery_app import app

def get_celery_worker_status():
    i = app.control.inspect()
    availability = i.ping()
    stats = i.stats()
    registered_tasks = i.registered()
    active_tasks = i.active()
    scheduled_tasks = i.scheduled()
    result = {
        'availability': availability,
        'stats': stats,
        'registered_tasks': registered_tasks,
        'active_tasks': active_tasks,
        'scheduled_tasks': scheduled_tasks,
    }
    return result
Of course, you could/should improve the code with error handling...
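For instance, a minimal sketch of such error handling, assuming the same app as above (per the first answer on this page, the inspect calls raise IOError when the broker is unreachable):

def get_celery_worker_status_safe():
    try:
        return get_celery_worker_status()
    except IOError as exc:
        # Broker (e.g. RabbitMQ) is down or unreachable
        return {'error': str(exc)}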

To check the same thing from the command line, in case celery is running as a daemon:
Activate your virtualenv and go to the directory where the 'app' is.
Now run: celery -A [app_name] status
It will show whether celery is up or not, plus the number of nodes online.
Source:
http://michal.karzynski.pl/blog/2014/05/18/setting-up-an-asynchronous-task-queue-for-django-using-celery-redis/
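If you need this check from Python rather than a shell, one hedged option is to run the same command with subprocess and rely on the exit code (celery status exits non-zero when no nodes reply, as the terminal transcript further down shows; the app name is a placeholder):

import subprocess

def celery_status_ok(app_name):
    result = subprocess.run(
        ["celery", "-A", app_name, "status"],
        capture_output=True, text=True)
    return result.returncode == 0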

The following worked for me:
import socket

from kombu import Connection

celery_broker_url = "amqp://localhost"

try:
    conn = Connection(celery_broker_url)
    conn.ensure_connection(max_retries=3)
except socket.error:
    raise RuntimeError("Failed to connect to RabbitMQ instance at {}".format(celery_broker_url))
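A small boolean wrapper around the same check might look like the sketch below. It reuses the celery_broker_url from above; note that newer kombu versions may raise kombu.exceptions.OperationalError rather than socket.error, so the except clause may need broadening:

def broker_is_reachable(url=celery_broker_url):
    try:
        Connection(url).ensure_connection(max_retries=3)
        return True
    except socket.error:
        return False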

One method to test if any worker is responding is to send out a 'ping' broadcast and return with a successful result on the first response.
from .celery import app  # the celery 'app' created in your project

def is_celery_working():
    result = app.control.broadcast('ping', reply=True, limit=1)
    return bool(result)  # True if at least one result
This broadcasts a 'ping' and will wait up to one second for responses. As soon as the first response comes in, it will return a result. If you want a False result faster, you can add a timeout argument to reduce how long it waits before giving up.
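For example (timeout is an existing parameter of broadcast, defaulting to one second; 0.5 here is just an illustrative value):

result = app.control.broadcast('ping', reply=True, limit=1, timeout=0.5)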

I found an elegant solution:
from .celery import app

try:
    app.broker_connection().ensure_connection(max_retries=3)
except Exception as ex:
    raise RuntimeError("Failed to connect to celery broker, {}".format(str(ex)))

You can use the ping method to check whether any worker (or a specific worker) is alive or not: https://docs.celeryproject.org/en/latest/_modules/celery/app/control.html#Control.ping
celery_app.control.ping()
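For example, a short sketch (assuming celery_app is your Celery instance; the worker name is a placeholder). You can ping all workers, or a specific one via destination:

all_replies = celery_app.control.ping(timeout=1.0)
one_reply = celery_app.control.ping(destination=['worker1@myhost'], timeout=1.0)
print(all_replies)  # e.g. [{'worker1@myhost': {'ok': 'pong'}}] when alive, [] when not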

You can test in your terminal by running the following command:
celery -A proj_name worker -l INFO
You can then review the output every time your celery worker runs.

The script below worked for me.

import logging

# Import the celery app from the project
from application_package import app as celery_app

logger = logging.getLogger(__name__)

def get_celery_worker_status():
    insp = celery_app.control.inspect()
    nodes = insp.stats()
    if not nodes:
        raise Exception("celery is not running.")
    logger.error("celery workers are: {}".format(nodes))
    return nodes

Run celery status to get the status.
When celery is running:
(venv) ubuntu@server1:~/project-dir$ celery status
-> celery@server1: OK
        1 node online.
When no celery worker is running, you get the following displayed in the terminal:
(venv) ubuntu@server1:~/project-dir$ celery status
Error: No nodes replied within time constraint

Related

Django rq-scheduler: jobs in scheduler don't get executed

In my Heroku application I successfully implemented background tasks. For this purpose I created a Queue object at the top of my views.py file and called queue.enqueue() in the appropriate view.
Now I'm trying to set up a repeated job with rq-scheduler's scheduler.schedule() method. I know that it is not the best way to do it, but I call this method again at the top of my views.py file. Whatever I do, I can't get it to work, even with a simple HelloWorld function.
views.py:
from datetime import datetime

from redis import Redis
from rq import Queue
from worker import conn
from rq_scheduler import Scheduler

q = Queue(connection=conn)  # the Queue object mentioned above (undefined in the original snippet)
scheduler = Scheduler(queue=q, connection=conn)
print("SCHEDULER = ", scheduler)

def say_hello():
    print(" Hello world!")

scheduler.schedule(
    scheduled_time=datetime.utcnow(),  # Time for first execution, in UTC timezone
    func=say_hello,                    # Function to be queued
    interval=60,                       # Time before the function is called again, in seconds
    repeat=10,                         # Repeat this number of times (None means repeat forever)
    queue_name='default',
)
worker.py:
import os

import redis
from rq import Worker, Queue, Connection
import django

django.setup()

listen = ['high', 'default', 'low']

redis_url = os.getenv('REDISTOGO_URL')
if not redis_url:
    print("Set up Redis To Go first. Probably can't get env variable REDISTOGO_URL")
    raise RuntimeError("Set up Redis To Go first. Probably can't get env variable REDISTOGO_URL")

conn = redis.from_url(redis_url)

if __name__ == '__main__':
    with Connection(conn):
        print(" CREATING NEW WORKER IN worker.py")
        worker = Worker(map(Queue, listen))
        worker.work()
I'm checking the length of my queue before and after schedule(), but it looks like the length is always 0. I can also see that there are jobs when I call scheduler.get_jobs(), but those jobs don't seem to get enqueued or performed.
I also don't want to use another cron solution for my project; since I can already do background tasks with rq, it shouldn't be that hard to implement a repeated task, or is it?
I went through the documentation a couple of times and now I feel stuck, so I'd appreciate any help or advice I can get.
Using rq 1.6.1 and rq-scheduler 0.10.0 packages with Django 2.2.5 and Python 3.6.10
Edit: When I print the jobs in the scheduler, I see that their enqueued_at param is set to None. Am I missing something really simple?

How to detect Celery Connection failure and switch to failover then back?

So our use case might be out of the remit of what Celery can do, but I thought I'd ask...
Use Case
We are planning on using a hosted/managed RabbitMQ cluster, which Celery will be using for its broker.
We want to ensure that our app has 0 downtime (obviously) so we're trying to figure out how we can handle the event when our upstream cluster has a catastrophic failure whereby the entire cluster is unavailable.
Our thought is that we have a standby Rabbit cluster that when the connection drops, we can automatically switch Celery to use that connection instead.
In the meantime, Celery is determining whether the master cluster is up and running and when it is, all of the publishers reconnect to the master, the workers drain the backup cluster and when empty, switch back onto the master.
The issue
What I'm having difficulty with is capturing the connection failure: it seems to happen deep within Celery, and the exception doesn't bubble up to the app.
I can see that Celery has a BROKER_FAILOVER_STRATEGY configuration property, which would handle the initial swap, but it (seemingly) is only utilised when failover occurs, which doesn't fit our use case of swapping back to the master when it is back up.
I've also come across Celery's "bootsteps", but these are applied after Celery's own "Connection" bootstep which is where the exception is being thrown.
I have a feeling this approach is probably not the best one given the limitations I've been finding, but has anyone got any ideas on how I'd go about overriding the default Connection bootstep or achieving this via a different means?
It's quite old, but maybe useful to someone. I'm using FastAPI with Celery 5.2.
run_api.py file:
import uvicorn

if __name__ == "__main__":
    port = 8893
    print("Starting API server on port {}".format(port))
    uvicorn.run("endpoints:app", host="localhost", port=port, access_log=False)
endpoints.py file:
import threading
import time
import os
import itertools
import random

from celery import Celery
from fastapi import FastAPI

# Create object for FastAPI
app = FastAPI()

# Create and configure Celery to manage queues
# ----
celery = Celery(__name__)
celery.conf.broker_url = ["redis://localhost:6379"]
celery.conf.result_backend = "redis://localhost:6379"
celery.conf.task_track_started = True
celery.conf.task_serializer = "pickle"
celery.conf.result_serializer = "pickle"
celery.conf.accept_content = ["pickle"]

def random_failover_strategy(servers):
    # The next lines are necessary for this to work, even if you don't use them:
    it = list(servers)  # don't modify callers list
    shuffle = random.shuffle
    for _ in itertools.repeat(None):
        # Do whatever action is required here to obtain the new url.
        # As an example, pick a random port:
        ra = random.randint(0, 100)
        it = [f"redis://localhost:{str(ra)}"]
        celery.conf.result_backend = it[0]
        shuffle(it)
        yield it[0]

celery.conf.broker_failover_strategy = random_failover_strategy

# Start the celery worker. I start it in a separate thread, so FastAPI can run in parallel
worker = celery.Worker()

def start_worker():
    worker.start()

ce = threading.Thread(target=start_worker)
ce.start()
# ----

@app.get("/", tags=["root"])
def root():
    return {"message": ""}

@app.post("/test")
def test(num: int):
    task = test_celery.delay(num)
    print(f'task id: {task.id}')
    return {
        "task_id": task.id,
        "task_status": "PENDING"}

@celery.task(name="test_celery", bind=True)
def test_celery(self, num):
    self.update_state(state='PROGRESS')
    print("ENTERED PROCESS", num)
    time.sleep(100)
    print("EXITING PROCESS", num)
    return {'number': num}

@app.get("/result")
def result(id: str):
    task_result = celery.AsyncResult(id)
    if task_result.status == "SUCCESS":
        return {
            "task_status": task_result.status,
            "task_num": task_result.result['number']
        }
    else:
        return {
            "task_status": task_result.status,
            "task_num": None
        }
Place both files in the same folder. Run python3 run_api.py.
Enjoy!
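A hypothetical client-side check of the endpoints above, using the requests library (not part of the original answer; localhost:8893 matches the port in run_api.py, and num/id are the parameters defined in endpoints.py):

import requests

# Enqueue a task, then poll for its result
resp = requests.post("http://localhost:8893/test", params={"num": 3})
task_id = resp.json()["task_id"]

status = requests.get("http://localhost:8893/result", params={"id": task_id})
print(status.json())  # e.g. {'task_status': 'PENDING', 'task_num': None}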

How to check if Celery/Supervisor is running using Python

How to write a script in Python that outputs if celery is running on a machine (Ubuntu)?
My use-case. I have a simple python file with some tasks. I'm not using Django or Flask. I use supervisor to run the task queue. For example,
tasks.py
from celery import Celery, task

app = Celery('tasks')

@app.task()
def add_together(a, b):
    return a + b
Supervisor:
[program:celery_worker]
directory=/var/app/
command=celery -A tasks worker -l info
This all works. I now want to have a page which checks whether the celery/supervisor process is running, i.e. something like the following, maybe using Flask, allowing me to host the page and return a 200 status so I can load balance.
For example...
check_status.py
from flask import Flask, render_template

app = Flask(__name__)

@app.route('/')
def status_check():
    # check supervisor is running
    if supervisor:
        return render_template('up.html')
    else:
        return render_template('down.html')

if __name__ == '__main__':
    app.run()
Update 09/2020: Jérôme updated this answer for Celery 4.3 here: https://stackoverflow.com/a/57628025/1159735
You can run the celery status command via code by importing the celery.bin.celery package:
import celery
import celery.bin.base
import celery.bin.celery
import celery.platforms

app = celery.Celery('tasks', broker='redis://')

status = celery.bin.celery.CeleryCommand.commands['status']()
status.app = status.get_app()

def celery_is_up():
    try:
        status.run()
        return True
    except celery.bin.base.Error as e:
        if e.status == celery.platforms.EX_UNAVAILABLE:
            return False
        raise e

if __name__ == '__main__':
    if celery_is_up():
        print('Celery up!')
    else:
        print('Celery not responding...')
How about using subprocess? I'm not sure if it is a good idea:
>>> import subprocess
>>> output = subprocess.check_output('ps aux'.split())
>>> b'supervisord' in output  # check_output returns bytes, hence b'...'
True
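A slightly more reusable sketch of the same idea (my own wrapper, not from the answer above):

import subprocess

def process_running(name):
    # check_output returns bytes, hence the decode before searching
    output = subprocess.check_output(['ps', 'aux']).decode('utf-8', errors='replace')
    # Caveat: this matches any process whose command line contains `name`
    return name in output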
You can parse the process state from supervisorctl status output:

import subprocess

def is_celery_worker_running():
    # check_output returns bytes in Python 3, hence the decode
    ctl_output = subprocess.check_output('supervisorctl status celery_worker'.split()).decode().strip()
    if ctl_output == 'unix:///var/run/supervisor.sock no such file':
        # supervisord not running
        return False
    elif ctl_output == 'No such process celery_worker':
        return False
    else:
        state = ctl_output.split()[1]
        return state == 'RUNNING'
Inspired by @vgel's answer, using Celery 4.3.0.

import celery
import celery.bin.base
import celery.bin.control
import celery.platforms

# Importing Celery app from my own application
from my_app.celery import app as celery_app

def celery_running():
    """Test Celery server is available

    Inspired by https://stackoverflow.com/a/33545849
    """
    status = celery.bin.control.status(celery_app)
    try:
        status.run()
        return True
    except celery.bin.base.Error as exc:
        if exc.status == celery.platforms.EX_UNAVAILABLE:
            return False
        raise

if __name__ == '__main__':
    if celery_running():
        print('Celery up!')
    else:
        print('Celery not responding...')
A sparse web user interface comes with supervisor; maybe you could use that. It can be enabled in the supervisor config. The key to look for is [inet_http_server].
You could even look at the source code of that piece to get ideas to implement your own.
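The same HTTP server that serves this UI also exposes supervisor's XML-RPC API, which can be queried from Python. A minimal sketch, assuming [inet_http_server] is enabled on 127.0.0.1:9001 without authentication (the port and process name here are illustrative):

from xmlrpc.client import ServerProxy

def celery_worker_state():
    server = ServerProxy('http://127.0.0.1:9001/RPC2')
    info = server.supervisor.getProcessInfo('celery_worker')
    return info['statename']  # e.g. 'RUNNING', 'STOPPED', 'FATAL'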
This isn't applicable for celery, but for anyone who ended up here to see if supervisord is running: check whether the pidfile defined for supervisord in your supervisord.conf configuration file exists. If so, it's running; if not, it isn't. The default pidfile is /tmp/supervisord.pid, which is what I used below.

import os
import sys

if os.path.isfile("/tmp/supervisord.pid"):
    print("supervisord is running.")
    sys.exit()
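A pid file can be stale (the process died without cleaning up), so a more defensive sketch also checks that the pid is alive with os.kill(pid, 0), which sends no signal (this extension is mine, not the answer's):

import os

def supervisord_running(pidfile="/tmp/supervisord.pid"):
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (IOError, ValueError):
        return False  # no pid file, or unparsable contents
    try:
        os.kill(pid, 0)  # signal 0: existence check only
    except ProcessLookupError:
        return False  # stale pid file
    except PermissionError:
        return True   # process exists but is owned by another user
    return True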
In my experience, I'd set a message to track whether a task completed or not, so that the queues would be responsible for retrying tasks.

Can't get "retry" to work in Celery

Given a file myapp.py
from celery import Celery

celery = Celery("myapp")
celery.config_from_object("celeryconfig")

@celery.task(default_retry_delay=5 * 60, max_retries=12)
def add(a, b):
    with open("try.txt", "a") as f:
        f.write("A trial = {}!\n".format(a + b))
    raise add.retry([a, b])
Configured with a celeryconfig.py
CELERY_IMPORTS = ["myapp"]
BROKER_URL = "amqp://"
CELERY_RESULT_BACKEND = "amqp"
In the directory that has both files, I run:
$ celeryd -E
And then
$ python -c "import myapp; myapp.add.delay(2, 5)"
or
$ celery call myapp.add --args="[2, 5]"
So try.txt is created with
A trial = 7!
only once. That means the retry was ignored.
I tried many other things:
- Using MongoDB as broker and backend and inspecting the database (strangely enough, I can't see anything in my broker "messages" collection even with a "countdown"-scheduled job)
- The PING example in here, both with RabbitMQ and MongoDB
- Printing on the screen with both print (like the PING example) and logging
- Making the retry call in an except block after an enforced Exception is raised, raising or returning the retry(), changing the "throw" parameter to True/False/not specified
- Seeing what's happening with celery flower (in which the "broker" link shows nothing)
But none of them worked =/
My celery report output:
software -> celery:3.0.19 (Chiastic Slide) kombu:2.5.10 py:2.7.3
            billiard:2.7.3.28 py-amqp:N/A
platform -> system:Linux arch:64bit, ELF imp:CPython
loader   -> celery.loaders.default.Loader
settings -> transport:amqp results:amqp
Is there anything wrong above? What I need to do to make the retry() method work?

celery get tasks count

I am using python celery+rabbitmq. I can't find a way to get the task count in some queue.
Something like this:
celery.queue('myqueue').count()
Is it possible to get the task count from a certain queue?
One solution is to run an external command from my python script:
rabbitmqctl list_queues -p my_vhost
and parse the results. Is that a good way to do this?
I suppose that using the rabbitmqctl command is not a good solution, especially on my ubuntu server, where rabbitmqctl can be executed only with root privileges.
By playing with pika objects I found working solution:
import pika
from django.conf import settings

def tasks_count(queue_name):
    '''Connects to the message queue using django settings and returns
    the count of messages in the queue with name queue_name.'''
    credentials = pika.PlainCredentials(settings.BROKER_USER, settings.BROKER_PASSWORD)
    parameters = pika.ConnectionParameters(credentials=credentials,
                                           host=settings.BROKER_HOST,
                                           port=settings.BROKER_PORT,
                                           virtual_host=settings.BROKER_VHOST)
    connection = pika.BlockingConnection(parameters=parameters)
    channel = connection.channel()
    # Re-declaring an existing durable queue is a no-op and returns its stats;
    # passive=True would inspect the queue without ever creating it.
    queue = channel.queue_declare(queue=queue_name, durable=True)
    message_count = queue.method.message_count
    connection.close()
    return message_count
I did not find documentation about inspecting the AMQP queue with pika, so I am not sure about the solution's correctness.
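An alternative sketch that stays inside Celery/kombu instead of pika (my own variant: it assumes app is your Celery application, the import path is hypothetical, and the queue already exists; passive=True inspects the queue without creating it):

from my_app.celery import app  # hypothetical import path

def tasks_count(queue_name):
    with app.connection_or_acquire() as conn:
        # py-amqp returns queue_declare_ok_t(queue, message_count, consumer_count)
        declared = conn.default_channel.queue_declare(queue=queue_name, passive=True)
        return declared.message_count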

Categories

Resources