I have Celery running on a few computers and use Flower for monitoring.
The computers are used by different people.
Celery beat generates jobs for all the workers from one of the computers.
Every time a newly coded task is ready, all the workers except the beat computer raise a "task not registered" exception.
What is the recommended way to sync the code to all the other computers on the network? Is there a pre-hook kind of mechanism in Celery to check for new code?
Unfortunately, you need to update the code on all the workers (nodes) and then restart all of them. This is by (good) design.
A clever systemd service could in theory handle this (a sketch follows below):
send the graceful shutdown signal,
run pip install -U your-project,
start the Celery service again.
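A minimal sketch of that cycle as a Python script, assuming a systemd unit named celery-worker.service and a package named your-project (both names are placeholders for whatever your deployment actually uses):

import subprocess

def redeploy():
    # Stop the worker; systemd sends SIGTERM, on which Celery performs
    # a warm shutdown and finishes the tasks it is currently executing.
    subprocess.run(["systemctl", "stop", "celery-worker.service"], check=True)
    # Install the latest version of the task code.
    subprocess.run(["pip", "install", "-U", "your-project"], check=True)
    # Start the worker again, now registering the new tasks.
    subprocess.run(["systemctl", "start", "celery-worker.service"], check=True)

if __name__ == "__main__":
    redeploy()

Run on every worker machine after each release, this keeps all nodes on the same task code, so beat never schedules a task the other workers don't know about.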
This post is a continuation of my previous post: celery how to implement single queue with multiple workers executing in parallel?
I set up Celery to work with eventlet using this command:
celery -A project worker -P eventlet -l info --concurrency=4
I can see that my tasks get moved to the active list faster (in Flower), but I am not sure whether they are executing in parallel. I have a 4-core server for production, but I am not utilizing all the cores at the same time.
My question is:
how can I use all 4 cores to execute tasks in parallel?
Both the eventlet and gevent worker types provide a great solution for concurrency, at the cost of limiting parallelism to 1. To get true parallel task execution and utilize all cores, run several Celery instances on the same machine (example commands below).
I know this goes against what popular Linux distros have in mind, so just ignore the system packages and roll your own configuration from scratch. A systemd service template is your friend.
Another option is to run Celery with the prefork pool: you get parallelism at the cost of limiting concurrency to the number of worker processes.
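For example (reusing the project module from the question; the worker names and concurrency values are arbitrary), one eventlet worker per core could be started as:

celery -A project worker -P eventlet -c 100 -n w1@%h
celery -A project worker -P eventlet -c 100 -n w2@%h
celery -A project worker -P eventlet -c 100 -n w3@%h
celery -A project worker -P eventlet -c 100 -n w4@%h

Or, with the prefork pool, a single worker with one process per core:

celery -A project worker -P prefork -c 4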
I'm new to RQ and am trying to use it for a job which will run in the background. I have managed to set it up, and I'm also able to start more than one worker.
Now I'm trying to run these workers concurrently. I installed supervisor and followed a tutorial to add programs to it, and it worked.
Here is my supervisor configuration:
[program:rqworker]
command=/usr/local/bin/rq worker mysql
process_name=rqworker1-%(process_num)s
numprocs=3
directory=/home/hp/Python/sample
stopsignal=TERM
autostart=true
autorestart=true
stdout_logfile=/home/hp/Python/sample/logs
The worker function is present in the sample directory mentioned above.
The problem is that even after specifying numprocs as 3 in the config file, the workers do not run in parallel.
Here are some screenshots, which show that although multiple workers have been started, they do not work in parallel.
Also, I saw this Stack Overflow answer, but it still doesn't divide the jobs amongst the workers!
Could anyone tell me what is wrong with this configuration/what I need to change?
I found the problem; it wasn't with supervisor or rqworker. The manager program was blocking concurrency by waiting for each task to complete before submitting the next one (illustrated below)!
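A sketch of the two manager patterns (process_row and the row data are hypothetical stand-ins for my actual code; the queue name matches the supervisor config above):

import time

from redis import Redis
from rq import Queue

from tasks import process_row  # hypothetical task module

q = Queue("mysql", connection=Redis())
rows = range(10)

def blocking_manager():
    # What I had: only one job is in flight at a time, so the three
    # workers can never run in parallel.
    for row in rows:
        job = q.enqueue(process_row, row)
        while not job.is_finished:
            time.sleep(0.5)

def concurrent_manager():
    # The fix: enqueue everything up front and let the workers drain
    # the queue in parallel, checking results afterwards.
    return [q.enqueue(process_row, row) for row in rows]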
I'm trying to implement a simple scheduler on a machine that I share with my colleague. The idea is to run a process in the background as a server, identified by its pid. I can submit the program I want to run to this server process, say from a different bash terminal, and let the server process schedule the job based on the availability of hardware resources. The submit program should be able to lock some content in memory and communicate with the server.
I was trying to use the Python multiprocessing or subprocess module for this, but I don't have a clear idea of how it should be done. Any help would be appreciated.
I think a cron job or Celery would be a better choice for your use case. Just create a Celery task and delay its execution according to your needs; a minimal sketch follows.
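For example (a sketch assuming a Redis broker on localhost; run_job is a placeholder for whatever your submit program needs to execute):

import subprocess

from celery import Celery

app = Celery("scheduler", broker="redis://localhost:6379/0")

@app.task
def run_job(command):
    # Run the submitted program; the worker process plays the role of
    # the background "server" described in the question.
    subprocess.run(command, shell=True, check=True)

# Submit from any terminal: execute roughly 60 seconds from now.
run_job.apply_async(args=["echo hello"], countdown=60)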
I would like to run APScheduler as part of a WSGI webapp (via Apache's mod_wsgi with 3 workers). I am new to the WSGI world, so I would appreciate it if you could resolve my doubts:
If APScheduler is part of the webapp, does it come alive only after the first request (the first one after an Apache start/restart) is handled by at least one worker? That is, starting/restarting Apache won't start it; at least one request is needed.
What about concurrent requests: would every worker run the same set of APScheduler tasks, or would there be only one set shared between all workers?
Would a process (the webapp run by a worker), once running, stay alive so that APScheduler's tasks keep executing, or could it terminate after some idle time (with the consequence that APScheduler's tasks won't execute)?
Thank you!
You're right: the scheduler won't start until the first request comes in.
Therefore running a scheduler in a WSGI worker is not a good idea. A better idea would be to run the scheduler in a separate process and connect to it when necessary via some RPC mechanism like RPyC or Execnet.
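A minimal sketch of such a standalone scheduler process, assuming APScheduler 3.x (the job function is a placeholder, and the RPC layer for talking to the webapp is left out):

from apscheduler.schedulers.blocking import BlockingScheduler

def cleanup():
    # Placeholder for a real periodic task.
    print("running periodic cleanup")

scheduler = BlockingScheduler()
scheduler.add_job(cleanup, "interval", minutes=15)

if __name__ == "__main__":
    # Runs in its own process (e.g. under systemd or supervisor), so it is
    # independent of Apache's worker lifecycle: it starts without waiting
    # for a request, there is exactly one set of jobs, and idle-timeout
    # recycling of WSGI workers cannot kill it.
    scheduler.start()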
I started to use Celery Flower for task monitoring, and it is working like a charm. I have one concern, though: how can I "reload" info about monitored tasks after a Flower restart?
I use Redis as a broker, and I need a way to check on tasks even after an unexpected restart of the service (or the server).
Thanks in advance
I found it out.
It is a matter of setting the persistent flag in the command that runs celery flower, as shown below.
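For example (reusing the project module name from earlier; the database path is arbitrary), Flower will save its state to the given file and reload it after a restart:

celery -A project flower --persistent=True --db=/var/lib/flower/flower.db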