How to configure celery to execute tasks concurrently from one queue - python

In an environment with 8 cores, celery should be able to process 8 incoming tasks in parallel by default. But sometimes, when new tasks are received, celery places them behind a long-running process.
I played around with the default configuration, letting one worker consume from one queue.
celery -A proj worker --loglevel=INFO --concurrency=8
Is my understanding wrong that one worker with a concurrency of 8 is able to process 8 tasks from one queue in parallel?
What is the preferred way to set up celery to prevent the behaviour described above?

To put it simply, concurrency is the number of jobs running on a worker, while prefetch is the number of jobs sitting in the worker's own queue waiting to be executed. You have one of two options here. The first is to set the prefetch multiplier down to 1, which means the worker will only keep, in your case, 8 additional jobs in its queue. The second, which I would recommend, is to create 2 different queues: one for your short-running tasks and another for your long-running tasks.
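A minimal sketch of both options, assuming Celery 5.x setting names (the task and queue names are illustrative):

# celeryconfig.py -- option 1: don't let each pool process reserve extra work
worker_prefetch_multiplier = 1

# option 2: route long- and short-running tasks to separate queues
task_routes = {
    'proj.tasks.generate_report': {'queue': 'long'},   # hypothetical long-running task
    'proj.tasks.send_email': {'queue': 'short'},       # hypothetical short-running task
}

Then start one worker per queue, for example:
celery -A proj worker -Q short --concurrency=8 --loglevel=INFO
celery -A proj worker -Q long --concurrency=2 --loglevel=INFO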

Related

Tasks getting duplicated when using multiple celery workers with same queue

I'm using celery to run tasks that are small and big in nature.
Setup:
I'm using separate queues to handle small, medium, and large tasks independently.
There are different celery workers catering to each of the different queues.
Celery 5.2.7, Python 3.8.10
Using Redis as the broker.
Late ack set to True
Prefetch count set to 1
Visibility timeout set to max.
Celery worker started with: celery -A celeryapp worker --concurrency=1 -Ofair -l INFO -E -Q bigtask-queue -n big@%h
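For reference, a minimal config sketch of the settings listed above (Celery 5.x setting names; the visibility timeout value is illustrative, since the maximum depends on the broker):

# celeryconfig.py
broker_url = 'redis://localhost:6379/0'
task_acks_late = True                 # acknowledge only after the task finishes
worker_prefetch_multiplier = 1        # reserve a single task per pool process
broker_transport_options = {'visibility_timeout': 43200}  # seconds; must exceed the longest task runtime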
I'm facing an issue where the tasks are getting duplicated across multiple workers of the same type. I'm auto-scaling based on the load on the CPU.
For example, when I have 4 tasks with a maximum of 4 workers, each of those 4 tasks is being queued up for execution on each of the 4 workers. That is, each task is getting executed 4 times, once on each machine, sequentially.
What I want is for each task to execute just once. If one worker has picked up a task from the queue, the same task shouldn't be picked up by another worker, and a new task should be picked up only once a new node is up.
I have tried the existing answers: setting the visibility timeout to the maximum value, setting the prefetch count to 1, and setting late ack to True. Nothing has helped.
What am I missing?
Does celery not recognize that the same task has already been picked up by the other worker?
Will using a flag on Redis for each task status work? Will there not be a race condition if multiple workers are already running?
Are there any other solutions?
Do you have a celery beat worker running? Something like this:
celery -A run.celery worker --loglevel=info --autoscale=5,2 -n app@beatworker --beat
We had the same problem, but I no longer remember how it was resolved. Try adding a separate worker with the --beat option; there should be only one --beat instance running.
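One way to guarantee a single scheduler is to run beat as its own process and start every autoscaled worker without --beat (the app and node names are illustrative):

celery -A run.celery beat --loglevel=info
celery -A run.celery worker --loglevel=info --autoscale=5,2 -n app@%h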

Python distributed tasks with multiple queues

So the project I am working on requires a distributed task system to process CPU-intensive tasks. This is relatively straightforward: spin up celery, throw all the tasks in a queue, and have celery do the rest.
The issue I have is that every user needs their own queue, and items within each user's queue must be processed sequentially. So if there is a task in a user's queue already processing, wait until it is finished before allowing a worker to pick up the next one.
The closest I've come to something like this is having a fixed set of queues and assigning them to users, then having the users' tasks picked off by celery workers pinned to a certain queue with a concurrency of 1.
The problem with this system is that I can't scale my workers to process a backlog of user tasks.
Is there a way I can configure celery to do what I want, or perhaps another task system exists that does what I want?
Edit:
Currently I use the following command to spawn my celery workers with a concurrency of one on a fixed set of queues
celery multi start 4 -A app.celery -Q:1 queue_1 -Q:2 queue_2 -Q:3 queue_3 -Q:4 queue_4 --logfile=celery.log --concurrency=1
I then store a queue name on the user object, and when the user starts a process I queue a task to the queue stored on the user object. This gives me my synchronous tasks.
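For context, the dispatch side of that approach looks roughly like this (the task name and user fields are hypothetical):

# when a user starts a process, send the task to that user's dedicated queue
process_user_job.apply_async(args=[user.id], queue=user.queue_name)  # queue_name is one of queue_1..queue_4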
The downside is that when I have multiple users sharing queues, tasks build up and never get processed.
I'd like to have, say, 5 workers and a queue per user object, then have the workers just hop across the queues, but never have more than 1 worker on a single queue at a time.
I use a chain (doc here) to execute tasks in a specific order:
chain = task1_task.si(account_pk) | task2_task.si(account_pk) | task3_task.si(account_pk)
chain()
So, for a specific user I execute task1; when it's finished I execute task2, and when that's finished I execute task3.
It will run on any available worker :)
For stopping a chain midway:
self.request.callbacks = None
return
And don't forget to bind your task:
@app.task(bind=True)
def task2_task(self, account_pk):
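Putting the pieces together, a minimal sketch (the broker URL, task bodies, and the early-exit condition are placeholders):

from celery import Celery

app = Celery('proj', broker='redis://localhost:6379/0')

@app.task
def task1_task(account_pk):
    print('step 1 for', account_pk)

@app.task(bind=True)
def task2_task(self, account_pk):
    if account_pk is None:               # hypothetical early-exit condition
        self.request.callbacks = None    # drop the remaining tasks in the chain
        return
    print('step 2 for', account_pk)

@app.task
def task3_task(account_pk):
    print('step 3 for', account_pk)

# run the three steps strictly in order for one account
chain = task1_task.si(42) | task2_task.si(42) | task3_task.si(42)  # 42 is an example account_pk
chain()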

Set the number of tasks that can run alongside a given task in celery

As I see it, celery lets you set the number of tasks a worker can run at the same time.
I need to run a task and set the number of tasks that can run simultaneously with it.
Therefore, if I set this number to 2 and this task is sent to a worker with 10 threads,
the worker should be able to run just one other task alongside it.
The worker will reserve tasks for each of its threads. If you want to limit the number of tasks the worker can execute at the same time, you should configure your concurrency (e.g. to limit it to 1 task at a time, you need a worker with 1 process: -c 1).
You can also check the prefetch configuration, but it only defines the number of tasks reserved for each process of the worker.
Here is the Celery documentation where the prefetch configuration is explained:
http://celery.readthedocs.org/en/latest/userguide/optimizing.html
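For example, a worker limited to a single task at a time, without reserving extra work (the --prefetch-multiplier flag is from Celery 5.x; older versions use the CELERYD_PREFETCH_MULTIPLIER setting):

celery -A proj worker -c 1 --prefetch-multiplier=1 --loglevel=INFO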

Celery task subprocesses fill up concurrency slots?

I am running a series of long-running, heavy-weight Celery tasks (which spawn multiple subprocesses) in a queue with CELERYD_CONCURRENCY = 4. Initially, 4 tasks are started as they should be. However, as tasks finish, no new ones are started until more finish, and soon Celery keeps the number of active tasks down to 1 or 2 until all tasks are complete (confirmed by Celery Flower).
When I only run simple tasks, such as the default Celery add function, everything works as expected.
Do the subprocesses started by Celery tasks (with the same process group ID as the task) count towards filling up the concurrency slots? Is there any way to make sure Celery only counts the tasks themselves?
Celery uses prefork as the default execution pool, and every time you spawn a subprocess (another fork), it counts towards the number of concurrent processes running, i.e. the number set by CELERYD_CONCURRENCY.
The way to avoid this is to use eventlet, which will allow you to spawn multiple async calls from each task, as long as your tasks don't have any calls that block, like subprocess.communicate.
To further optimize, you can try splitting the tasks that use subprocess.communicate into a different queue with a worker using prefork, and everything else that doesn't block into a worker using eventlet.
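A sketch of that split, with queue names chosen for illustration:

celery -A proj worker -P prefork --concurrency=4 -Q blocking -n blocking@%h
celery -A proj worker -P eventlet --concurrency=100 -Q nonblocking -n nonblocking@%h

Tasks that call subprocess.communicate would be routed to the blocking queue, everything else to nonblocking.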

Does the number of celeryd processes depend on the --concurrency setting?

We are running Celery behind Supervisor and start it with
celeryd --events --loglevel=INFO --concurrency=2
This, however, creates a process graph that is up to three layers deep and contains up to 7 celeryd processes (Supervisor spawns one celeryd, which spawns several others, which again spawn processes). Our machine has two CPU cores.
Are all of these processes working on tasks? Are maybe some of them just worker pools? How is the --concurrency setting connected to the number of processes actually spawned?
You shouldn't have 7 processes if --concurrency is 2.
The actual processes started are:
The main consumer process, which delegates work to the worker pool
The worker pool (this is the number that --concurrency decides)
So that is 3 processes with a concurrency of two.
In addition, a very lightweight process used to clean up semaphores is started
if force_execv is enabled (which it is by default if you're using some transport
other than redis or rabbitmq).
NOTE that in some cases process listings also include threads.
The worker may start several threads if using transports other than rabbitmq/redis,
including one Mediator thread that is always started unless CELERY_DISABLE_RATE_LIMITS is enabled.
