This post continues my previous post: celery how to implement single queue with multiple workers executing in parallel?
I set up Celery to work with eventlet using this command:
celery -A project worker -P eventlet -l info --concurrency=4
I can see in Flower that my tasks move to the active list faster, but I am not sure whether they are executing in parallel. I have a 4-core production server, but I am not utilizing all the cores at the same time.
My question is:
How can I use all 4 cores to execute tasks in parallel?
Both the eventlet and gevent worker types provide a great solution for concurrency, at the cost of limiting parallelism to 1. To get true parallel task execution and utilise all cores, run several Celery instances on the same machine.
I know this goes counter to what popular Linux distros have in mind, so just ignore the system packages and roll your own configuration from scratch. A systemd service template is your friend.
Another option is to run Celery with the prefork pool: you get parallelism, at the cost of limiting concurrency to the number of worker processes.
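As a rough sketch of the "several instances" option, the snippet below builds one worker command per core, reusing the eventlet command from the question. The helper name and the node-name pattern (worker1@%h, worker2@%h, ...) are assumptions for illustration, and how the commands get launched (systemd, supervisor, ...) is left out.

```python
import shlex


def worker_commands(app="project", instances=4, concurrency=4):
    """Build one `celery worker` command line per instance.

    Each instance is a separate OS process, so the kernel can schedule
    them on different cores; every instance still runs its own eventlet pool.
    """
    commands = []
    for i in range(1, instances + 1):
        commands.append([
            "celery", "-A", app, "worker",
            "-P", "eventlet",
            "-l", "info",
            f"--concurrency={concurrency}",
            # Node names must be unique per instance; this pattern is an assumption.
            "-n", f"worker{i}@%h",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be pasted into a service definition.
    for argv in worker_commands():
        print(shlex.join(argv))
```

With 4 instances on a 4-core machine, each instance can occupy one core while still multiplexing I/O-bound tasks inside its own green-thread pool.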
I have been using Celery in production on a web server with Django for almost two years, and for almost as long I have been searching without success for a solution to this problem: "How do I specify the number of threads available to Celery?"
I have 32 threads on my production server and 7 Celery queues.
I run Celery on CentOS, managed by Supervisord, like this:
celery.ini
[program:Site_Web_celery-worker1]
command=/etc/supervisord.d/celery-worker1.sh
directory=/var/www/html/SiteWeb/Site_Web/
user=apache
numprocs=1
stdout_logfile=/var/log/celery/worker1.log
stderr_logfile=/var/log/celery/worker1.log
autostart=true
autorestart=true
priority=999
stopasgroup=true
The Celery command line for the first queue:
celery -A Site_Web.celery_settings worker -l info --autoscale 22 -Q default -n worker1.%h
In summary:
How can I tell Celery to work only on the first 30 threads and never use the last 2?
Thanks in advance for any help and tips.
If I understood correctly, you want to set CPU affinity for every worker process spawned by Celery. Celery does not support setting CPU affinity for its worker processes, and to do this manually you would have to spend a large amount of time writing a monitoring tool that constantly "watches" the Celery worker and its child processes and sets their CPU affinity using taskset or something similar.
I personally believe it is not worth the effort. Good reasons for setting CPU affinity are rare - trust your system's scheduler.
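If you still want to experiment, a cheaper route than a watcher may be to restrict the affinity of the launching process, since on Linux child processes inherit their parent's CPU affinity mask. This is a minimal sketch, assuming Linux (os.sched_setaffinity does not exist on other platforms) and assuming CPU indices 0-29 are the ones to keep. The helper names are hypothetical, and none of this is a Celery feature.

```python
import os


def allowed_cpus(total=32, reserved_last=2):
    """Return the CPU indices to keep, leaving the last ones free."""
    return set(range(total - reserved_last))


def pin_current_process(cpus):
    """Restrict this process (pid 0 means "self") to `cpus`; Linux only.

    Processes forked afterwards inherit the mask.
    Returns the resulting affinity set, or None when unsupported.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, cpus)
    return os.sched_getaffinity(0)
```

Calling pin_current_process(allowed_cpus()) before the worker processes are forked would keep them off the last two CPUs. Whether that helps in practice is exactly the question the scheduler remark above raises.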
I have Celery running on a few computers and use Flower for monitoring.
The computers are used by different people.
Celery beat generates jobs for all the workers from one of the computers.
Every time a newly coded task is ready, all the workers except the one on the beat computer raise a "task not registered" exception.
What is the recommended way to sync the code to all the other computers in the network? Is there a pre-hook kind of mechanism in Celery to check for new code?
Unfortunately, you need to update the code on all the workers (nodes) and after that you need to restart all of them. This is by (good) design.
A clever systemd service could, in theory:
send the graceful shutdown signal
run pip install -U your-project
start the Celery service
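The three steps above can be sketched as a small script. The unit name "celery" is an assumption, "your-project" is the placeholder from the list, and the runner is injectable so the sequence can be exercised without touching a real system.

```python
import subprocess


def redeploy_steps(service="celery", package="your-project"):
    """Return the three commands in the order listed above."""
    return [
        # By default systemd stops a unit with SIGTERM, which Celery
        # treats as a warm (graceful) shutdown.
        ["systemctl", "stop", service],
        ["pip", "install", "-U", package],
        ["systemctl", "start", service],
    ]


def redeploy(run=subprocess.run, **kwargs):
    """Run each step in order, aborting on the first failure (check=True)."""
    for argv in redeploy_steps(**kwargs):
        run(argv, check=True)
```

This still has to run on every node, so something outside Celery (a deploy tool, a timer, a CI job) has to trigger it.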
I'm new to RQ and am trying to use it for a job which will run in the background. I have managed to set it up, and I'm also able to start more than one worker.
Now I'm trying to run these workers concurrently. I installed supervisor and followed a tutorial to add programs to it, and it worked.
Here is my supervisor configuration:
[program:rqworker]
command=/usr/local/bin/rq worker mysql
process_name=rqworker1-%(process_num)s
numprocs=3
directory=/home/hp/Python/sample
stopsignal=TERM
autostart=true
autorestart=true
stdout_logfile=/home/hp/Python/sample/logs
The worker function is present in the sample directory mentioned above.
The problem is that even after specifying numprocs as 3 in the config file, the workers do not run in parallel.
Here are some screenshots, which show that although multiple workers have been started, they do not work in parallel.
Also, I saw this stackoverflow answer, but it still doesn't divide the jobs amongst the workers!
Could anyone tell me what is wrong with this configuration/what I need to change?
I found the problem; it wasn't with supervisor or rqworker. The manager program was blocking concurrency, by waiting for task completion!
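To illustrate the pattern (this is not the original manager program, which is not shown here), the sketch below uses a thread pool as a stand-in for the RQ workers. The function names are hypothetical. A manager that waits for each job before submitting the next keeps all but one worker idle, while one that submits everything first lets the workers run side by side.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def job(n):
    time.sleep(0.1)  # stand-in for real work
    return n * n


def blocking_manager(pool, items):
    # Waits for each job's result before submitting the next one,
    # so only one worker is ever busy.
    return [pool.submit(job, n).result() for n in items]


def non_blocking_manager(pool, items):
    # Submits every job first, then collects the results.
    futures = [pool.submit(job, n) for n in items]
    return [f.result() for f in futures]
```

Both managers return the same results; only the second keeps all the workers busy at once.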
I would like to run APScheduler as part of a WSGI webapp (served via Apache's mod_wsgi with 3 workers). I am new to the WSGI world, so I would appreciate it if you could resolve my doubts:
If APScheduler is part of the webapp, does it come alive only after the first request (the first one after Apache starts or restarts) handled by at least one worker? Starting or restarting Apache alone won't start it; at least one request is needed.
What about concurrent requests: would every worker run the same set of APScheduler tasks, or would there be only one set shared between all workers?
Would a running process (the webapp run by a worker) stay alive, so that APScheduler's tasks execute, or could it terminate after some idle time, so that APScheduler's tasks won't execute?
Thank you!
You're right -- the scheduler won't start until the first request comes in.
Therefore running a scheduler in a WSGI worker is not a good idea. A better idea would be to run the scheduler in a separate process and connect to the scheduler when necessary via some RPC mechanism like RPyC or Execnet.
I have unit tests for my Django project.
Some of the views in my Django project run Celery tasks, and I want to check the database after these tasks finish.
I have separate tests for the Celery tasks, where I call them without the .delay() method.
The main question: what is the best and cleanest way to have a Celery worker available during the Jenkins job?
Currently I just run nohup celery -A myqpp worker & before the tests and kill all running Celery processes at the end of the job.
The best and cleanest way is not to have any Celery workers during the Jenkins job, nor any queue or result backend. Use the CELERY_ALWAYS_EAGER setting to execute your tasks locally in unit tests, blocking until each task returns.
See the Celery documentation for more: CELERY_ALWAYS_EAGER docs
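A minimal sketch of a test settings fragment, assuming the old-style uppercase setting names used in this answer. On Celery 4+ with a CELERY namespace, the equivalent names are CELERY_TASK_ALWAYS_EAGER and CELERY_TASK_EAGER_PROPAGATES. Where the fragment lives (for example a separate test settings module) is also an assumption.

```python
# Test-only Django settings fragment.

# Run tasks synchronously in the calling process instead of sending them
# to a broker, so .delay() returns only after the task has finished.
CELERY_ALWAYS_EAGER = True

# Re-raise exceptions from eagerly executed tasks, so a failing task
# fails the test instead of being stored silently in the result.
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True
```

With this in place, a view test can call the view and assert on the database right away, with no worker or broker running on the Jenkins node.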
Just to extend the answer about always-eager mode: see my answer to another question for how to run a Celery worker from test setUp: https://stackoverflow.com/a/42107423/590233
But a few things need to be done there:
Connect the Celery worker to the test db
Somehow run a message broker instance ... (I think you already run it before the tests, but the cleanest way is to spawn the broker instance from setUp, just like the Celery worker)