Is there a way to get all tasks being added to Celery to perform one-after-the-next?
I have a bunch of Celery tasks that can be triggered at any time (by users), and I would like them not to all run at the same time, so as to lighten the load on my server.
The simplest way to achieve this is to have a dedicated worker with concurrency set to 1, subscribed to a "special" queue. Then send the tasks that you want to run sequentially to this queue. celery multi (which starts multiple workers on the same node) is especially useful for such use cases.
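As a rough sketch of the dedicated-queue idea, a settings module could route the sequential tasks to their own queue. The module name celeryconfig.py, the task name myapp.tasks.heavy_job and the queue name sequential are placeholders, not anything from the question:

```python
# celeryconfig.py: a hypothetical settings module, loaded with
# app.config_from_object("celeryconfig").

# Send the tasks that must run one at a time to a dedicated queue.
# "myapp.tasks.heavy_job" and "sequential" are placeholder names.
task_routes = {
    "myapp.tasks.heavy_job": {"queue": "sequential"},
}

# Everything else keeps using the default queue ("celery" is Celery's default name).
task_default_queue = "celery"

# Then start one worker that consumes only the dedicated queue, one task at a time:
#   celery -A myapp worker -Q sequential --concurrency=1
```

Tasks not listed in task_routes are unaffected, so only the ones that need serializing pay the cost of running one by one.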
Related
I want to categorize submitted tasks by custom tags, so that I would be able to kill all the tasks from one category. Is it possible to do that?
You can group tasks with routing keys and queues. If you do that, you can purge an entire queue to kill all the tasks waiting in it. The only caveat is that your production configuration will have to be modified so that new queues are created automatically.
So the project I am working on requires a distributed task system to process CPU-intensive tasks. This is relatively straightforward: spin up Celery, throw all the tasks in a queue, and have Celery do the rest.
The issue I have is that every user needs their own queue, and items within each user's queue must be processed synchronously. So if there is a task in a user's queue already processing, wait until it is finished before allowing a worker to pick up the next.
The closest I've come to something like this is having a fixed set of queues and assigning them to users, then having the users' tasks picked off by Celery workers fixed to a certain queue with a concurrency of 1.
The problem with this system is that I can't scale my workers to process a backlog of user tasks.
Is there a way I can configure celery to do what I want, or perhaps another task system exists that does what I want?
Edit:
Currently I use the following command to spawn my Celery workers with a concurrency of one on a fixed set of queues:
celery multi start 4 -A app.celery -Q:1 queue_1 -Q:2 queue_2 -Q:3 queue_3 -Q:4 queue_4 --logfile=celery.log --concurrency=1
I then store a queue name on the user object, and when the user starts a process I queue a task to the queue stored on the user object. This gives me my synchronous tasks.
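A minimal sketch of the user-to-queue assignment, assuming the queue is derived from the user id instead of being stored on the user object (the helper queue_for_user and the task name process_task are hypothetical):

```python
import zlib

QUEUE_COUNT = 4  # matches queue_1 .. queue_4 in the command above


def queue_for_user(user_id: int, queue_count: int = QUEUE_COUNT) -> str:
    """Map a user to one of the fixed queues, stably across processes."""
    # crc32 is deterministic between runs, unlike the built-in hash() for strings.
    digest = zlib.crc32(str(user_id).encode("utf-8"))
    return f"queue_{digest % queue_count + 1}"


# A task would then be sent with something like (process_task is hypothetical):
#   process_task.apply_async(args=[user.pk], queue=queue_for_user(user.pk))
```

Because each queue has exactly one worker at concurrency 1, all tasks for a given user still run in order; users that land on the same queue wait behind each other.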
The downside is that when multiple users share a queue, tasks build up and never get processed.
I'd like to have, say, 5 workers and a queue per user object, then have the workers hop across the queues, but never have more than 1 worker on a single queue at a time.
I use a chain (see the docs) to execute tasks in a specific order:
chain = task1_task.si(account_pk) | task2_task.si(account_pk) | task3_task.si(account_pk)
chain()
So, for a specific user, I execute task1; when it's finished I execute task2, and when that's finished I execute task3.
It will spawn on any available worker :)
For stopping a chain midway:
self.request.callbacks = None
return
And don't forget to bind your task :
@app.task(bind=True)
def task2_task(self, account_pk):
I'm trying to implement a chatbot system, and I need to have my celery tasks processed sequentially per user. That means each user needs to have their messages sent as FIFO, but the users need to be processed randomly or in round-robin.
I've been reading about task chains, groups and trees, but all of these celery features seem to require providing all tasks at once, whereas I need to add tasks dynamically.
My reasoning was to have a dedicated queue per user, and enable concurrency on the queues. That way I can assure the delivery order and avoid one user blocking the rest of the chats.
Is there a way I can route tasks in Celery so I get the desired behavior? Ideally, I'd set up a worker to process the messages queue, and then route tasks to messages.contact.<contact_id> or the like.
There's no explicit mention of this behavior in the docs. Is it possible? Thanks!
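For illustration only, the routing half of this idea could be sketched as a Celery router function. The task name chat.send_message and the contact_id keyword argument are assumptions; the sketch does not by itself guarantee FIFO order, since that still needs exactly one consumer per queue running at a concurrency of 1:

```python
def route_by_contact(name, args, kwargs, options, task=None, **kw):
    """Router function: pick a per-contact queue for chat messages."""
    # "chat.send_message" and the "contact_id" kwarg are hypothetical names.
    contact_id = (kwargs or {}).get("contact_id")
    if name == "chat.send_message" and contact_id is not None:
        return {"queue": f"messages.contact.{contact_id}"}
    # Returning None lets Celery fall back to the next router or the default queue.
    return None


# In the Celery configuration, routers are listed under task_routes:
task_routes = (route_by_contact,)
```

A worker only consumes the queues it was started with (or was later told to consume), so some mechanism for attaching workers to newly created per-contact queues is still needed.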
In my application, I have Python Celery tasks that connect to a REST API. Simple.
The problem I have is that the API does not allow multiple requests with the same credentials.
Is there a way to have these API tasks block in the queue? Meaning, if multiple requests are made around the same time, can I have the tasks sit in the queue and execute one by one, waiting for the first in the queue to finish?
Currently, in the RabbitMQ message queue (with one worker), I see the tasks go through (spawned) and not wait.
I looked over documentation but could not find a simple solution.
Thanks.
With one worker process (a concurrency of 1), it's impossible for Celery to run more than one task at a time. What you may be seeing is called prefetching, which allows the worker to reserve tasks ahead of time.
http://docs.celeryproject.org/en/latest/userguide/optimizing.html#prefetch-limits
The default prefetch multiplier is 4; turn it down to 1 and see if that fixes it.
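A possible configuration fragment for this, using Celery's lowercase setting names (whether it resolves the behavior described in the question is untested):

```python
# Fragment of a Celery settings module (e.g. a hypothetical celeryconfig.py).

# Run a single worker process so tasks execute strictly one at a time.
worker_concurrency = 1

# Reserve only one message per worker process instead of the default 4.
worker_prefetch_multiplier = 1

# Acknowledge a message after the task finishes, so the worker does not
# fetch the next one while the current task is still running.
# Note: a task may be re-delivered if the worker dies mid-run, so tasks
# should be safe to execute more than once.
task_acks_late = True
```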
Sometimes the Celery queue builds up with accidental, unnecessary tasks, clogging up the server. E.g. the code fires off 20,000 tasks instead of 1.
How can one inspect which Python tasks the Celery queue contains and then selectively get rid of certain tasks?
Tasks are defined and started with the standard Celery decorators (if that matters):
@task()
def update_foobar(foo, bar):
    # Some heavy action here
    pass
update_foobar.delay(foo, bar)
Stack: Django + Celery + RabbitMQ.
Maybe you can use Flower. It's a real-time monitor for Celery with a nice web interface. I think you can shut down tasks from there. Anyway, I would try to avoid queuing those unnecessary tasks in the first place.
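For a scripted alternative to a web interface, one option is to filter the snapshot returned by app.control.inspect().reserved() and revoke the matching ids. The helper below is a hypothetical sketch that only does the filtering; note that reserved() lists tasks already prefetched by workers, not every message still waiting in RabbitMQ:

```python
def task_ids_by_name(snapshot, task_name):
    """Collect the ids of tasks called `task_name` from an inspect() snapshot."""
    ids = []
    # inspect() calls return None when no worker replies, hence the `or {}`.
    for worker_tasks in (snapshot or {}).values():
        for info in worker_tasks:
            if info.get("name") == task_name:
                ids.append(info["id"])
    return ids


# Sample data in the shape of app.control.inspect().reserved():
# {worker_name: [task_info_dict, ...]}. Names and ids are made up.
sample = {
    "celery@host1": [
        {"id": "a1", "name": "myapp.tasks.update_foobar"},
        {"id": "b2", "name": "myapp.tasks.other"},
    ],
    "celery@host2": [{"id": "c3", "name": "myapp.tasks.update_foobar"}],
}

unwanted = task_ids_by_name(sample, "myapp.tasks.update_foobar")

# With a live app, each id could then be passed to app.control.revoke(task_id).
```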