Celery: remove empty queues that are more than 5 minutes old?

I am trying to clean up all the stale queues that linger. I want to remove queues that have been empty for over 5 minutes.
Another way I was thinking of is using pyrabbit to access the queue directly, but I am not sure how I can find out whether a queue is older than 5 minutes.

You can do this from the command line using
sudo rabbitmqctl set_policy expiry ".*" '{"expires":300000}' --apply-to queues
This deletes all unused queues after 300 seconds. "Unused" means the queue has no consumers, has not been redeclared, and basic.get has not been invoked for a duration of at least the expiration period.
Note that this expiry time can also be set when declaring a queue. More in the RabbitMQ docs.
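For example, the same thing can be set per queue at declaration time; a minimal sketch with kombu (the messaging library Celery is built on), where the queue name is just an illustration:

from kombu import Queue

# x-expires is in milliseconds; RabbitMQ deletes the queue once it has
# been unused for that long.
transient_queue = Queue('my_transient_queue', queue_arguments={'x-expires': 300000})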

Related

Every time Celery is restarted, all the scheduled tasks are acknowledged first, and it takes 20 minutes if there are 100,000 tasks

I mean that every time we restart our Celery workers because our app is being deployed, they do not actually execute any tasks from the queue until all the tasks have been received/acknowledged first. That takes a huge amount of time given the number of tasks in our queue: close to 100K tasks, and close to 20 minutes of lag accordingly.
This question was already asked in 2016 here; here is what I can come up with given that, but I hope there is a better way:
We can support multiple Celery queues inside a single RabbitMQ instance and do this:
tasks by default go to the new queue
upon every deploy, we move all the thousands of tasks from the new queue to the old one (I am not sure how quickly this can be done)
different workers deal with the old and new tasks
That way at least the new incoming tasks won't be waiting for the old ones to be acked (a sketch of the routing part follows below). But imagine that among the old tasks there is one which should be executed during the period while the old tasks are still being fed to the workers; this approach won't solve that.
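For what it's worth, the routing part of that plan is straightforward; a minimal sketch with hypothetical app and queue names:

from celery import Celery

app = Celery('proj', broker='amqp://localhost')

# New tasks go to the "new" queue by default.
app.conf.task_routes = {'proj.tasks.*': {'queue': 'new'}}

# Dedicated workers then drain each queue separately, e.g.:
#   celery -A proj worker -Q new -n new_worker@%h
#   celery -A proj worker -Q old -n old_worker@%h

Moving the already-queued tasks from one queue to the other is the part that still needs a separate mechanism.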

Celery Django running periodic tasks after the previous one is done [django-celery-beat]

I want to use the django-celery-beat library to make some changes in my database periodically. I set the task to run every 10 minutes. Everything works fine as long as my task takes less than 10 minutes; if it lasts longer, the next task starts while the first one is still doing calculations, and that causes an error.
My task looks like this:
from celery import shared_task
from .utils.database_blockchain import BlockchainVerify

@shared_task()
def run_function():
    build_block = BlockchainVerify()
    return "Database updated"
Is there a way to avoid starting the same task if the previous one hasn't finished?
There is definitely a way: locking.
There is a whole page about it in the Celery documentation - Ensuring a task is only executed one at a time.
Briefly explained: you can use a cache or even the database to store a lock, and then every time a task starts, check whether the lock is still in use or has already been released.
Be aware that the task may fail or run longer than expected. Task failure can be handled by adding an expiration to the lock; set the lock expiration long enough in case the task is still running.
There already is a good thread on SO - link.
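For illustration, a minimal sketch of that cache-lock pattern applied to the task above, assuming a Django cache backend whose add() is atomic (e.g. Memcached or Redis):

from celery import shared_task
from django.core.cache import cache

from .utils.database_blockchain import BlockchainVerify

LOCK_EXPIRE = 60 * 15  # longer than the longest expected run, so a dead task frees the slot

@shared_task()
def run_function():
    # cache.add succeeds only if the key does not exist yet,
    # so it doubles as an atomic "try-lock".
    if not cache.add('run_function_lock', True, LOCK_EXPIRE):
        return "Skipped: previous run still in progress"
    try:
        BlockchainVerify()
        return "Database updated"
    finally:
        cache.delete('run_function_lock')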

limited number of user-initiated background processes

I need to allow users to submit requests for very, very large jobs. We are talking 100 gigabytes of memory and 20 hours of computing time. This costs our company a lot of money, so it was stipulated that only 2 jobs could be running at any time, and requests for new jobs when 2 are already running would be rejected (and the user notified that the server is busy).
My current solution uses an Executor from concurrent.futures, and requires setting the Apache server to run only one process, reducing responsiveness (current user count is very low, so it's okay for now).
If possible I would like to use Celery for this, but I did not see any way in the documentation to accomplish this particular setup.
How can I run up to a limited number of jobs in the background in a Django application, and notify users when jobs are rejected because the server is busy?
I have two solutions for this particular case: one out-of-the-box solution from Celery, and another that you implement yourself.
You can do something like this with Celery workers. In particular, you create only two worker processes with concurrency=1 (or one worker with concurrency=2, though depending on the pool those two slots may be threads rather than separate processes); this way, only two jobs can run asynchronously. Now you need a way to raise exceptions when both slots are occupied: use inspect to count the number of active tasks and throw an exception if required. For the implementation, you can check out this SO post.
You might also be interested in rate limits.
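If a simple per-worker rate is enough, that is a one-line task option; a tiny sketch (the app name and limit value are just examples):

from celery import Celery

app = Celery('proj', broker='amqp://localhost')

@app.task(rate_limit='2/h')  # at most two task executions per hour, enforced per worker
def huge_job():
    ...

Note that rate limits throttle how often tasks may start on each worker; they do not cap how many are running at the same time.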
You can do it all yourself using a locking solution of your choice. In particular, a nice implementation that makes sure only two processes are running, using Redis (and redis-py), is as simple as the following. (This assumes you know Redis, since you know Celery.)
from redis import StrictRedis

redis = StrictRedis('localhost', 6379)
locks = ['compute:lock1', 'compute:lock2']
for key in locks:
    lock = redis.lock(key, blocking_timeout=5)
    acquired = lock.acquire()
    if acquired:
        try:
            do_huge_computation()  # your long-running job
        finally:
            lock.release()
        break
    print("Gonna try next possible slot")
if not acquired:
    raise SystemLimitsReached("Already at max capacity!")
This way you make sure only two running processes can exist in the system. A third process will block on the lock.acquire() line for blocking_timeout seconds; if the locking was successful, acquired is True, otherwise it's False and you tell your user to wait!
I had the same requirement some time in the past, and what I ended up coding was something like the solution above. In particular:
It has the least amount of race conditions possible.
It's easy to read.
It doesn't depend on a sysadmin suddenly doubling the concurrency of workers under load and blowing up the whole system.
You can also implement the limit per user, meaning each user can have 2 simultaneously running jobs, by changing the lock keys from compute:lock1 to compute:userId:lock1 and lock2 accordingly. You can't do this with vanilla Celery.
First of all you need to limit concurrency on your worker (docs):
celery -A proj worker --loglevel=INFO --concurrency=2 -n <worker_name>
This will help to make sure that you do not have more than 2 active tasks even if there are errors in the code.
Now you have 2 ways to implement task number validation:
You can use inspect to get the number of active and scheduled tasks:
from celery import current_app

def start_job():
    inspect = current_app.control.inspect()
    active_tasks = inspect.active() or {}
    scheduled_tasks = inspect.scheduled() or {}
    # Nodenames have the form name@host; worker_name is the name given to -n.
    worker_key = 'celery@%s' % worker_name
    worker_tasks = active_tasks.get(worker_key, []) + scheduled_tasks.get(worker_key, [])
    if len(worker_tasks) >= 2:
        raise MyCustomException('It is impossible to start more than 2 tasks.')
    else:
        my_task.delay()
You can store the number of currently executing tasks in the DB and validate task execution based on it.
The second approach could be better if you want to scale your functionality later - introduce premium users, or disallow a single user from running 2 requests at once. A sketch of this idea follows below.
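For illustration, a minimal sketch of the DB-backed counter, assuming a hypothetical JobSlot model (a single row with a running IntegerField); select_for_update keeps the check-and-increment atomic:

from django.db import transaction

from myapp.models import JobSlot  # hypothetical model: one row, `running` IntegerField
from myapp.tasks import my_task

MAX_JOBS = 2

def try_start_job():
    with transaction.atomic():
        slot = JobSlot.objects.select_for_update().get(pk=1)
        if slot.running >= MAX_JOBS:
            return False  # reject: server busy, notify the user
        slot.running += 1
        slot.save()
    my_task.delay()
    return True

The task itself then has to decrement the counter when it finishes, including on failure, or the slots will leak.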
First
You need the first part of SpiXel's solution. According to him, "you only create two worker processes with concurrency=1".
Second
Set a timeout for tasks waiting in the queue, via CELERY_EVENT_QUEUE_TTL, and a queue length limit as described in how to limit number of tasks in queue and stop feeding when full?.
Therefore, while the two workers are running jobs, a task that has been waiting in the queue longer than the timeout (10 seconds, or whatever period you like) will time out. And if the queue is full, newly arriving tasks will be dropped.
Third
You need something extra to deal with notifying users when jobs are rejected because the server is busy.
Dead Letter Exchanges are what you need here, for every task that fails because of the queue length limit or the message timeout: "Messages will be dropped or dead-lettered from the front of the queue to make room for new messages once the limit is reached."
You can set "x-dead-letter-exchange" to route to another queue; once that queue receives the dead-lettered message, you can send a notification to the user.

Sending message to RabbitMQ (pika) from scheduler callback doesn't work

I need different messages to be sent to the queue, each on its own schedule. So I have a list of messages and, for each, the interval at which to resend it. I use RabbitMQ/pika and apscheduler.
Following numerous examples, I created the simplest BlockingConnection/channel/queue. When I push messages immediately after that, everything works fine; I can see in the RabbitMQ web interface that all the messages end up in the queue. Here is the piece of code that works:
self.cr = Queue('DIRECT_C_QUEUE', True, ex_type='direct')  # Queue is the asker's own wrapper, not a pika class
for i in range(1, 10000):
    self.cr.channel.basic_publish(exchange='', routing_key='DIRECT_C_QUEUE', body='hello_world')
But if I try to push messages (in exactly the same way) via an apscheduler callback function, only a few (about 1-10) messages appear in the queue, even though the callbacks fire all the time and no exceptions are raised when publishing.
Eventually I begin to receive warnings like this:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pika/connection.py:642: UserWarning: Pika: Write buffer exceeded warning threshold at 1125 bytes and an estimated 43 frames behind
warn(message % (self.outbound_buffer.size, est_frames_behind))
and still no new messages in the queue.
I am new to Python; any help is much appreciated.
I found the source of the problem:
apscheduler runs the basic_publish calls in a separate thread, and pika does not recommend sharing connections between threads - http://pika.github.com/faq.html
So I had a choice: either create a new connection each time, or put new messages on an internal queue and publish them from the main thread (where the connection was created); a sketch of the latter follows below.
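A minimal sketch of that second option: scheduler callbacks only put message bodies on a thread-safe queue.Queue, and the main thread, which owns the pika connection, does all the publishing:

import queue

import pika

outgoing = queue.Queue()  # hand-off between scheduler threads and the main thread

def scheduled_job():
    # Runs in an apscheduler worker thread: never touches pika directly.
    outgoing.put('hello_world')

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='DIRECT_C_QUEUE', durable=True)

while True:
    try:
        body = outgoing.get(timeout=1)
        channel.basic_publish(exchange='', routing_key='DIRECT_C_QUEUE', body=body)
    except queue.Empty:
        connection.process_data_events()  # let pika service heartbeats while idle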
I fixed that problem by increasing the ulimit:
edit /etc/default/rabbitmq-server and set
ulimit -n 4096
then restart rabbitmq
sudo /etc/init.d/rabbitmq-server restart

Is it possible to empty a job queue on a Gearman server

Is it possible to empty a job queue on a Gearman server? I am using the python driver for Gearman, and the documentation does not have any information about emptying queues. I would imagine that this functionality should exist, possibly, with a direct connection to the Gearman server.
I came across this method:
/usr/bin/gearman -t 1000 -n -w -f function_name > /dev/null
which basically dumps all the jobs into /dev/null.
The telnetable administrative protocol (search for "Administrative Protocol") doesn't have a command to empty a queue either; there is only a shutdown command.
If you wish to avoid downtime, you could write a generic "job consumer" worker and use that to empty the queues. I've set one up as a script which takes a list of job names, and just sits there accepting jobs and consuming them.
Something like:
# generic_consumer.py job1 job2 job3
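A minimal sketch of such a script using the python-gearman library, assuming gearmand on the default host and port:

import sys

import gearman

def consume(worker, job):
    # Accept the job and discard its payload.
    return ''

worker = gearman.GearmanWorker(['127.0.0.1:4730'])
for job_name in sys.argv[1:]:
    worker.register_task(job_name, consume)
worker.work()  # blocks forever, draining the registered functions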
You can use the administrative protocol's status command to get a list of the function names and counts on the queue. The administrative protocol docs tell you the format of the response.
# (echo status ; sleep 0.1) | netcat 127.0.0.1 4730
As far as I have been able to tell from the docs and from using Gearman with PHP, the only way to clear the job queue is to restart the gearmand job server. If you are using persistent job queues, you will also need to empty whatever you are using as persistent storage; if this is DB storage, you will need to delete all the rows from the appropriate tables.
stop gearmand --> empty table rows --> start gearmand
Hope this is clear enough.
