I'm using RQ, and I have a failed queue with thousands of items, and another test queue I created a while back for testing, which is now empty and unused. I'm wondering how to remove all jobs from the failed queue, and how to delete the test queue altogether?
Apologies for the basic question, but I can't find info on this in the RQ docs, and I'm completely new to both Redis and RQ... Thanks in advance!
Cleanup using rq
RQ offers methods to make any queue empty:
>>> from redis import Redis
>>> from rq import Queue
>>> qfail = Queue("failed", connection=Redis())
>>> qfail.count
8
>>> qfail.empty()
8L
>>> qfail.count
0
You can do the same for the test queue, if it is still present.
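For the test queue, a minimal sketch, assuming the queue is named "test" and an RQ release recent enough to provide Queue.delete() (check your installed version):
from redis import Redis
from rq import Queue

qtest = Queue("test", connection=Redis())
qtest.empty()                    # drop any jobs that might still be in it
qtest.delete(delete_jobs=True)   # remove the queue itself (newer RQ versions)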
Cleanup using rq-dashboard
Install rq-dashboard:
$ pip install rq-dashboard
Start it:
$ rq-dashboard
RQ Dashboard, version 0.3.4
* Running on http://0.0.0.0:9181/
Open it in a browser.
Select the queue.
Click the red "Empty" button.
And you are done.
Cleanup using a Python function (purge jobs)
If you run a Redis version that is too old and fails on the command RQ uses, you can still delete the
jobs with Python code:
The code takes the name of a queue, which holds the job ids.
Using LPOP we fetch the job ids one by one.
Adding the prefix (by default "rq:job:") to a job id gives the key under which the job is stored.
Using DEL on each key we purge the database job by job.
>>> import redis
>>> r = redis.StrictRedis()
>>> qname = "rq:queue:failed"
>>> def purgeq(r, qname):
...     while True:
...         jid = r.lpop(qname)
...         if jid is None:
...             break
...         jid = jid.decode()  # redis-py returns bytes on Python 3
...         r.delete("rq:job:" + jid)
...         print(jid)
...
>>> purgeq(r, qname)
a0be3624-86c1-4dc4-bb2e-2043d2734b7b
3796c312-9b02-4a77-be89-249aa7325c25
ca65f2b8-044c-41b5-b5ac-cefd56699758
896f70a7-9a35-4f6b-b122-a08513022bc5
- 2016 -
You can now use rq's empty option from the command line:
/path/to/rq empty queue_name
So you can use it to empty any queue, not just the failed one.
None of the above solutions worked for me, because the failed queue is not registered under queues.
So I moved all of the failed jobs to the default queue (one way to do that is sketched below) and then used
rq empty queue_name --url [redis-url]
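A hedged sketch of one way to do that move, assuming an RQ version whose FailedJobRegistry has a requeue method; each failed job is put back onto the queue it came from, where rq empty can then clear it:
from redis import Redis
from rq import Queue
from rq.registry import FailedJobRegistry

queue = Queue("default", connection=Redis())
registry = FailedJobRegistry(queue=queue)
for job_id in registry.get_job_ids():
    registry.requeue(job_id)  # puts the job back on its origin queue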
The monitoring tool rqinfo can empty the failed queue.
Just make sure you have an active virtualenv with rq installed, and run
$ rqinfo --empty-failed-queue
See rqinfo --help for more details.
You can just log in to Redis and clear all queues.
To log in:
user@user:~$ redis-cli
Enter this command and hit Enter:
FLUSHALL
And you're done.
Edit: this will delete everything stored in Redis.
Here's how to clear the failed job registry using django_rq:
import django_rq
from rq.registry import FailedJobRegistry
queue = django_rq.get_queue("your_queue_with_failed_jobs")
registry = FailedJobRegistry(queue=queue)
for job_id in registry.get_job_ids():
    registry.remove(job_id)
- 2022 -
I was struggling with this as well, and this is a piece of code which works for me.
It loops over the queue names (in my case, 'default' and 'low'), fetches all failed jobs for each queue, and removes them:
import django_rq
from rq.registry import FailedJobRegistry
from redis import Redis
from rq.job import Job
from django.conf import settings
redis = Redis(host=settings.REDIS_HOST, port=settings.REDIS_PORT)
queues = ["default", "low"]
for q in queues:
    queue = django_rq.get_queue(q)
    registry = FailedJobRegistry(queue=queue)
    for job_id in registry.get_job_ids():
        job = Job.fetch(job_id, connection=redis)
        registry.remove(job)
By default, rq jobs are prefixed with 'rq:job'. So you can delete these jobs from Redis using the following command (quoting the pattern so the shell does not expand it):
redis-cli KEYS "rq:job:*" | xargs redis-cli DEL
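The same idea in Python, using SCAN instead of KEYS so a large Redis instance is not blocked (a sketch assuming a local Redis and the default "rq:job:" prefix):
from redis import Redis

r = Redis()
for key in r.scan_iter("rq:job:*"):
    r.delete(key)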
Related
In my Heroku application I successfully implemented background tasks. For this purpose I created a Queue object at the top of my views.py file and called queue.enqueue() in the appropriate view.
Now I'm trying to set up a repeated job with rq-scheduler's scheduler.schedule() method. I know that it is not the best way to do it, but I call this method again at the top of my views.py file. Whatever I do, I can't get it to work, even with a simple HelloWorld function.
views.py:
from datetime import datetime

from redis import Redis
from rq import Queue
from worker import conn
from rq_scheduler import Scheduler

q = Queue(connection=conn)  # the queue created at the top of views.py, as described above
scheduler = Scheduler(queue=q, connection=conn)
print("SCHEDULER = ", scheduler)

def say_hello():
    print(" Hello world!")

scheduler.schedule(
    scheduled_time=datetime.utcnow(),  # Time for first execution, in UTC timezone
    func=say_hello,                    # Function to be queued
    interval=60,                       # Time before the function is called again, in seconds
    repeat=10,                         # Repeat this number of times (None means repeat forever)
    queue_name='default',
)
worker.py:
import os
import redis
from rq import Worker, Queue, Connection
import django
django.setup()
listen = ['high', 'default', 'low']
redis_url = os.getenv('REDISTOGO_URL')
if not redis_url:
    print("Set up Redis To Go first. Probably can't get env variable REDISTOGO_URL")
    raise RuntimeError("Set up Redis To Go first. Probably can't get env variable REDISTOGO_URL")

conn = redis.from_url(redis_url)

if __name__ == '__main__':
    with Connection(conn):
        print(" CREATING NEW WORKER IN worker.py")
        worker = Worker(map(Queue, listen))
        worker.work()
I'm checking the length of my queue before and after schedule(), but it looks like the length is always 0. I can also see that there are jobs when I call scheduler.get_jobs(), but those jobs don't seem to get enqueued or performed.
I also don't want to use another cron solution for my project; as I can already do background tasks with rq, it shouldn't be that hard to implement a repeated task, or is it?
I went through the documentation a couple of times and now I feel stuck, so I appreciate any help or advice I can get.
Using rq 1.6.1 and rq-scheduler 0.10.0 packages with Django 2.2.5 and Python 3.6.10
Edit: When I print the jobs in the scheduler, I see that their enqueued_at param is set to None. Am I missing something really simple?
I have an SQS queue on a LocalStack server and I'm trying to consume messages from it with a Celery consumer.
It seems that the consumer is properly attached to the queue (for example the queue sqs-test-queue), but it does not receive any messages when I try to send one with the aws command.
My celeryconfig.py looks like this:
from kombu import (
    Exchange,
    Queue
)
broker_transport_options = {'region': REGION}
broker_transport = 'sqs'
accept_content = ['application/json']
result_serializer = 'json'
content_encoding = 'utf-8'
task_serializer = 'json'
worker_enable_remote_control = False
worker_send_task_events = False
result_backend = None
task_queues = (
    Queue('sqs-test-queue', exchange=Exchange(''), routing_key='sqs-test-queue'),
)
and my tasks.py module looks like this:
from celery import Celery
from kombu.utils.url import quote
AWS_ACCESS_KEY = quote("AWS_ACCESS_KEY")
AWS_SECRET_KEY = quote("AWS_SECRET_KEY")
LOCALSTACK = "<IP>:<PORT>"
broker_url = "sqs://{access}:{secret}#{host}".format(access=AWS_ACCESS_KEY,
secret=AWS_SECRET_KEY,
host=LOCALSTACK)
app = Celery('tasks', broker=broker_url, backend=None)
app.config_from_object('celeryconfig')
@app.task(bind=True, name='tasks.consume', acks_late=True, ignore_result=True)
def consume(self, msg):
    # DO SOMETHING WITH THE RECEIVED MESSAGE
    return True
I tried to execute it with celery -A tasks worker -l INFO -Q sqs-test-queue and everything seems OK:
...
[tasks]
. tasks.consume
[... INFO/MainProcess] Connected to sqs://AWS_ACCESS_KEY:**@<IP>:<PORT>//
[... INFO/MainProcess] celery@local ready
but when I try to send a message with aws sqs send-message --endpoint-url=http://<IP>:<PORT> --queue-url=http://localhost:<PORT>/queue/sqs-test-queue --message-body="Test message", nothing happens.
What am I doing wrong? Have I missed something in the configuration maybe?
PS: If I try to run the command aws sqs receive-message --endpoint-url=http://<IP>:<PORT> --queue-url=http://localhost:<PORT>/queue/sqs-test-queue, I'm able to get the message.
NOTE:
I'm using Python 3.7.0 and my pip freeze looks like this:
boto3==1.10.16
botocore==1.13.16
celery==4.3.0
kombu==4.6.6
pycurl==7.43.0.3
...
I am going through the same thing as you. To fix it I did a few things:
1. I set the HOSTNAME_EXTERNAL and HOSTNAME env variables in LocalStack.
2. Set broker_url to sqs://{access}:{secret}@{host}:{port} (as you have it).
3. Made sure that the celery worker's broker_transport_options does not include the config item wait_time_seconds, since this causes errors with LocalStack as of February 7th, 2020 (check the issue here).
Once I did those things, it started working (a hedged config sketch follows below). Hope it helps.
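A hedged sketch of what the relevant celeryconfig.py pieces could look like after those fixes; the region, host, and port values here are placeholders, and HOSTNAME_EXTERNAL/HOSTNAME are set on the LocalStack container itself, not in Celery:
from kombu.utils.url import quote

AWS_ACCESS_KEY = quote("AWS_ACCESS_KEY")
AWS_SECRET_KEY = quote("AWS_SECRET_KEY")

# assumption: LocalStack is reachable on localhost:4566 (the edge port)
broker_url = "sqs://{access}:{secret}@{host}:{port}".format(
    access=AWS_ACCESS_KEY, secret=AWS_SECRET_KEY, host="localhost", port=4566)

broker_transport_options = {
    'region': 'us-east-1',
    # 'wait_time_seconds': 20,  # deliberately left out: it broke consumption against LocalStack
}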
Celery can't publish or consume arbitrary messages to/from any message queue system. Use kombu for that - that is what Celery uses behind the scenes too.
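A minimal kombu sketch of publishing and consuming a raw message, under the assumption of a LocalStack SQS broker; the URL, credentials, and queue name are placeholders for your setup:
from kombu import Connection

with Connection("sqs://AWS_ACCESS_KEY:AWS_SECRET_KEY@localhost:4566",
                transport_options={'region': 'us-east-1'}) as conn:
    queue = conn.SimpleQueue('sqs-test-queue')
    queue.put({'hello': 'world'})            # publish a raw message
    msg = queue.get(block=True, timeout=10)  # consume it back
    print(msg.payload)
    msg.ack()
    queue.close()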
I'm using Celery to manage asynchronous tasks. Occasionally, however, the celery process goes down which causes none of the tasks to get executed. I would like to be able to check the status of celery and make sure everything is working fine, and if I detect any problems display an error message to the user. From the Celery Worker documentation it looks like I might be able to use ping or inspect for this, but ping feels hacky and it's not clear exactly how inspect is meant to be used (if inspect().registered() is empty?).
Any guidance on this would be appreciated. Basically what I'm looking for is a method like so:
def celery_is_alive():
    from celery.task.control import inspect
    return bool(inspect().registered())  # is this right??
EDIT: It doesn't even look like registered() is available on celery 2.3.3 (even though the 2.1 docs list it). Maybe ping is the right answer.
EDIT: Ping also doesn't appear to do what I thought it would do, so still not sure the answer here.
Here's the code I've been using. celery.task.control.Inspect.stats() returns a dict containing lots of details about the currently available workers, None if there are no workers running, or raises an IOError if it can't connect to the message broker. I'm using RabbitMQ - it's possible that other messaging systems might behave slightly differently. This worked in Celery 2.3.x and 2.4.x; I'm not sure how far back it goes.
def get_celery_worker_status():
    ERROR_KEY = "ERROR"
    try:
        from celery.task.control import inspect
        insp = inspect()
        d = insp.stats()
        if not d:
            d = { ERROR_KEY: 'No running Celery workers were found.' }
    except IOError as e:
        from errno import errorcode
        msg = "Error connecting to the backend: " + str(e)
        if len(e.args) > 0 and errorcode.get(e.args[0]) == 'ECONNREFUSED':
            msg += ' Check that the RabbitMQ server is running.'
        d = { ERROR_KEY: msg }
    except ImportError as e:
        d = { ERROR_KEY: str(e) }
    return d
From the documentation of celery 4.2:
from your_celery_app import app
def get_celery_worker_status():
    i = app.control.inspect()
    availability = i.ping()
    stats = i.stats()
    registered_tasks = i.registered()
    active_tasks = i.active()
    scheduled_tasks = i.scheduled()
    result = {
        'availability': availability,
        'stats': stats,
        'registered_tasks': registered_tasks,
        'active_tasks': active_tasks,
        'scheduled_tasks': scheduled_tasks
    }
    return result
Of course, you could and should improve the code with error handling; a hedged sketch follows below.
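For example, one hedged way to add that error handling (the exact exception raised when the broker is down depends on the broker and Celery version, so the catch here is deliberately broad):
from your_celery_app import app

def get_celery_worker_status_safe():
    try:
        i = app.control.inspect()
        availability = i.ping()
        if not availability:
            return {'error': 'No running Celery workers were found.'}
        return {
            'availability': availability,
            'stats': i.stats(),
            'registered_tasks': i.registered(),
            'active_tasks': i.active(),
            'scheduled_tasks': i.scheduled(),
        }
    except Exception as exc:  # e.g. the broker is unreachable
        return {'error': 'Error connecting to the broker: {}'.format(exc)}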
To check the same from the command line, in case celery is running as a daemon:
Activate the virtualenv and go to the directory where the 'app' is.
Now run: celery -A [app_name] status
It will show whether celery is up or not, plus the number of nodes online.
Source:
http://michal.karzynski.pl/blog/2014/05/18/setting-up-an-asynchronous-task-queue-for-django-using-celery-redis/
The following worked for me:
import socket
from kombu import Connection
celery_broker_url = "amqp://localhost"
try:
    conn = Connection(celery_broker_url)
    conn.ensure_connection(max_retries=3)
except socket.error:
    raise RuntimeError("Failed to connect to RabbitMQ instance at {}".format(celery_broker_url))
One method to test if any worker is responding is to send out a 'ping' broadcast and return with a successful result on the first response.
from .celery import app # the celery 'app' created in your project
def is_celery_working():
    result = app.control.broadcast('ping', reply=True, limit=1)
    return bool(result)  # True if at least one result
This broadcasts a 'ping' and will wait up to one second for responses. As soon as the first response comes in, it will return a result. If you want a False result faster, you can add a timeout argument to reduce how long it waits before giving up.
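For example, a small sketch of that quicker-failing variant (timeout is in seconds):
# wait at most half a second for the first reply
result = app.control.broadcast('ping', reply=True, limit=1, timeout=0.5)
is_up = bool(result)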
I found an elegant solution:
from .celery import app
try:
    app.broker_connection().ensure_connection(max_retries=3)
except Exception as ex:
    raise RuntimeError("Failed to connect to celery broker, {}".format(str(ex)))
You can use the ping method to check whether any worker (or a specific worker) is alive: https://docs.celeryproject.org/en/latest/_modules/celery/app/control.html#Control.ping
celery_app.control.ping()
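A small usage sketch (the reply format in the comment is what a responding worker typically returns; pass destination=['celery@hostname'] to target a specific worker):
replies = celery_app.control.ping(timeout=1.0)
workers_alive = bool(replies)  # e.g. [{'celery@hostname': {'ok': 'pong'}}] when a worker answers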
You can test in your terminal by running the following command:
celery -A proj_name worker -l INFO
You can then watch the output every time your celery worker runs.
The script below worked for me.
# Import the celery app from the project
from application_package import app as celery_app
import logging

logger = logging.getLogger(__name__)

def get_celery_worker_status():
    insp = celery_app.control.inspect()
    nodes = insp.stats()
    if not nodes:
        raise Exception("celery is not running.")
    logger.error("celery workers are: {}".format(nodes))
    return nodes
Run celery status to get the status.
When celery is running,
(venv) ubuntu@server1:~/project-dir$ celery status
-> celery@server1: OK
1 node online.
When no celery worker is running, you get the below information displayed in terminal.
(venv) ubuntu@server1:~/project-dir$ celery status
Error: No nodes replied within time constraint
I am using Python celery+rabbitmq. I can't find a way to get the task count in a given queue.
Some thing like this:
celery.queue('myqueue').count()
Is it possible to get the task count from a certain queue?
One solution is to run an external command from my Python script:
"rabbitmqctl list_queues -p my_vhost"
and parse the results. Is this a good way to do it?
I suppose that using the rabbitmqctl command is not a good solution, especially on my Ubuntu server, where rabbitmqctl can only be executed with root privileges.
By playing with pika objects I found a working solution:
import pika
from django.conf import settings
def tasks_count(queue_name):
    ''' Connects to the message queue using django settings and returns the count of messages in the queue named queue_name. '''
    credentials = pika.PlainCredentials(settings.BROKER_USER, settings.BROKER_PASSWORD)
    parameters = pika.ConnectionParameters(credentials=credentials,
                                           host=settings.BROKER_HOST,
                                           port=settings.BROKER_PORT,
                                           virtual_host=settings.BROKER_VHOST)
    connection = pika.BlockingConnection(parameters=parameters)
    channel = connection.channel()
    queue = channel.queue_declare(queue=queue_name, durable=True)
    message_count = queue.method.message_count
    connection.close()  # close the connection so it is not leaked
    return message_count
I did not find documentation about inspecting the AMQP queue with pika, so I am not sure about the solution's correctness.
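One hedged refinement: pika's queue_declare accepts passive=True, which only inspects the queue instead of (re)declaring it, so mismatched declaration arguments cannot bite you. A minimal sketch with hard-coded connection details:
import pika

def tasks_count_passive(queue_name, host='localhost'):
    # passive=True only checks the queue; it raises an error if the queue does not exist
    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    queue = channel.queue_declare(queue=queue_name, passive=True)
    count = queue.method.message_count
    connection.close()
    return count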
I need to have a Python client that can discover queues on a restarted RabbitMQ server exchange, and then start up clients to resume consuming messages from each queue. How can I discover queues with some RabbitMQ-compatible Python API or library?
There does not seem to be a direct AMQP way to manage the server, but there is a way you can do it from Python. I would recommend using the subprocess module combined with the rabbitmqctl command to check the status of the queues.
I am assuming that you are running this on Linux. From a command line, running:
rabbitmqctl list_queues
will result in:
Listing queues ...
pings 0
receptions 0
shoveled 0
test1 55199
...done.
(well, it did in my case due to my specific queues)
In your code, use this to get the output of rabbitmqctl:
import subprocess

proc = subprocess.Popen("/usr/sbin/rabbitmqctl list_queues", shell=True, stdout=subprocess.PIPE)
stdout_value = proc.communicate()[0]
print(stdout_value.decode())
Then, just come up with your own code to parse stdout_value for your own use.
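For example, a minimal parsing sketch (it assumes the classic "name<TAB>count" output shown above):
def parse_rabbitmqctl_output(output):
    queues = {}
    for line in output.decode().splitlines():
        # skip the "Listing queues ..." header and "...done." footer
        if line.startswith("Listing queues") or line.startswith("...done"):
            continue
        parts = line.split("\t")
        if len(parts) == 2:
            name, count = parts
            queues[name] = int(count)
    return queues

print(parse_rabbitmqctl_output(stdout_value))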
As far as I know, there isn't any way of doing this. That's nothing to do with Python, but because AMQP doesn't define any method of queue discovery.
In any case, in AMQP it's clients (consumers) that declare queues: publishers publish messages to an exchange with a routing key, and consumers determine which queues those routing keys go to. So it does not make sense to talk about queues in the absence of consumers.
You can enable the rabbitmq_management plugin:
sudo /usr/lib/rabbitmq/bin/rabbitmq-plugins enable rabbitmq_management
sudo service rabbitmq-server restart
Then use the REST API:
import requests
def rest_queue_list(user='guest', password='guest', host='localhost', port=15672, virtual_host=None):
    url = 'http://%s:%s/api/queues/%s' % (host, port, virtual_host or '')
    response = requests.get(url, auth=(user, password))
    queues = [q['name'] for q in response.json()]
    return queues
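A small usage sketch: the default vhost "/" has to be URL-encoded as %2F in the path, and omitting virtual_host lists queues across all vhosts (assumes the management plugin on localhost:15672 with the default guest credentials):
print(rest_queue_list(virtual_host='%2F'))  # queues in the default vhost "/"
print(rest_queue_list())                    # queues across all vhosts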
I'm using the requests library in this example, but that is not essential; any HTTP client will do.
I also found a library that does this for us: pyrabbit
from pyrabbit.api import Client
cl = Client('localhost:15672', 'guest', 'guest')
queues = [q['name'] for q in cl.get_queues()]
Since I am a RabbitMQ beginner, take this with a grain of salt, but there's an interesting Management Plugin, which exposes an HTTP interface: "From here you can manage exchanges, queues, bindings, virtual hosts, users and permissions. Hopefully the UI is fairly self-explanatory."
http://www.rabbitmq.com/blog/2010/09/07/management-plugin-preview-release/
I use https://github.com/bkjones/pyrabbit. It talks directly to the API of RabbitMQ's management plugin and is very handy for interrogating RabbitMQ.
Management features are due in a future version of AMQP, so for now you will have to wait for a new version that comes with that functionality.
I found this works for me (/els being my demo vhost name):
rabbitmqctl list_queues --vhost /els
pyrabbit didn't work so well for me; however, the Management Plugin itself has its own command line script that you can download from your own admin GUI and use later on (for example, I downloaded mine from
http://localhost:15672/cli/
for local use).
I would simply use this:
Just replace the user (default: guest), passwd (default: guest) and port with your values.
import requests
import json
def call_rabbitmq_api(host, port, user, passwd):
    url = 'http://%s:%s/api/queues' % (host, port)
    r = requests.get(url, auth=(user, passwd))
    return r

def get_queue_name(json_list):
    res = []
    for item in json_list:  # renamed from "json" to avoid shadowing the json module
        res.append(item["name"])
    return res

if __name__ == '__main__':
    host = 'rabbitmq_host'
    port = 55672  # note: on RabbitMQ 3.x and later the management API listens on 15672
    user = 'guest'
    passwd = 'guest'
    res = call_rabbitmq_api(host, port, user, passwd)
    print("--- dump json ---")
    print(json.dumps(res.json(), indent=4))
    print("--- get queue name ---")
    q_name = get_queue_name(res.json())
    print(q_name)
Referred from here: https://gist.github.com/hiroakis/5088513#file-example_rabbitmq_api-py-L2