I'm using Celery 3.x with the RabbitMQ backend. From time to time I need to restart Celery (to push a new source code update to the server). But there is a task with a big loop and a try/except inside the loop; it can take a few hours to complete. Nothing critical will happen if I stop it and restart it later.
QUESTION: The problem is that every time after I stop the workers (via sudo service celeryd stop), I have to kill the task manually (via kill -9); the task ignores the SIGTERM from the worker. I've read through the Celery docs & Stack Overflow but I can't find a working solution. Any ideas how to fix the problem?
Sending the QUIT signal will stop workers immediately: sudo service celeryd stop -QUIT
If the CELERY_ACKS_LATE setting is set to True, tasks that were running when the worker stopped will run again when the worker starts back up.
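For reference, a minimal sketch of where that setting lives, assuming a standard celeryconfig.py (Celery 3.x setting names):

# celeryconfig.py -- with late acks, the broker message is acknowledged only
# *after* the task finishes, so a task interrupted by a worker shutdown is
# redelivered and re-executed on restart. The task should be idempotent.
CELERY_ACKS_LATE = True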
Celery is not intended to run long tasks, because a long task blocks a worker for itself alone. I recommend re-arranging your logic: make the task invoke itself instead of looping. Once a shutdown is in progress, the current task completes its chunk, and the work resumes at the same point after the Celery restart.
Also, with the task split into chunks, you will be able to divert the work to another worker/host, which is probably what you will want to do in the future.
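For illustration, a minimal sketch of that pattern, assuming an AMQP broker; the chunk size and process_item are hypothetical placeholders:

from celery import Celery

app = Celery("proj", broker="amqp://")  # illustrative broker URL

CHUNK = 100  # items handled per invocation; tune for your workload

def process_item(i):
    ...  # placeholder for the real per-item work

@app.task(acks_late=True)
def process_range(start, total):
    end = min(start + CHUNK, total)
    for i in range(start, end):
        process_item(i)  # wrap in try/except as your loop does today
    if end < total:
        # Re-enqueue the remainder: after a warm shutdown, the next worker
        # picks up exactly where the last finished chunk ended.
        process_range.delay(end, total)

Kicked off once with process_range.delay(0, total), each invocation is short, so a warm shutdown only ever loses (at most) the current chunk.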
Currently I am running a task in Celery which takes 10 to 15 minutes to complete. The problem is: how do I restart a task that was running in the worker (and also the tasks that were still waiting in the queue) after I forcefully stopped the worker, or after my server crashed or stopped? What happens right now is that if I start Celery again, it doesn't resume the last running task or the remaining tasks.
One thing you can do is enable acks_late on the task. Additionally, it's probably worthwhile to read the FAQ section on acks_late and retries.
@app.task(acks_late=True)
def task(*args, **kwargs):
    ...
I have three Celery workers as follows, each running on a different ECS node:
Producer: Keeps generating & sending tasks to the consumer worker. Each task is expected to take several minutes to compute and has a database record.
Consumer: Receives computation tasks and immediately starts execution.
Watchdog: Periodically inspects database records, finds computation tasks that are marked as executing, and then runs celery inspect active to verify that a worker is actually carrying out the computation.
We ensured that when the Consumer node is being terminated, the Celery worker on it begins a graceful shutdown, so that the ongoing computation can finish normally. But because Celery unregisters a gracefully stopping worker, the Consumer becomes invisible to the Watchdog, which will mistakenly think a computation task has mysteriously been lost... even though the Consumer is still working on the task.
Is it possible to let a Celery worker broadcast an "I am dying" message upon receiving a warm shutdown signal? Or even better, can we somehow let the Watchdog worker still see shutting workers?
Yes, it is possible. Nodes in the Celery cluster I am responsible for do something similar. Here is a snippet:
from celery.signals import worker_ready, worker_shutdown

@worker_shutdown.connect
def handle_worker_shutdown(**kwargs):
    _handle_worker_shutdown(app, _LOGGER, **kwargs)

@worker_ready.connect
def handle_worker_ready(**kwargs):
    _handle_worker_ready(app, _LOGGER, **kwargs)
There are a few other very useful signals you should have a look at, but these two are essential. Maybe worker_shutting_down is even more suitable for your use case...
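For example, a hedged sketch using worker_shutting_down; notify_watchdog is a hypothetical helper, e.g. one that flips a status row in the same database the Watchdog already polls:

from celery.signals import worker_shutting_down

@worker_shutting_down.connect
def announce_shutdown(sig=None, how=None, exitcode=None, **kwargs):
    # Fires when the worker receives a warm-shutdown signal, before
    # in-flight tasks finish, so the Watchdog can tell "shutting down"
    # apart from "mysteriously lost".
    notify_watchdog(status="shutting_down", how=how)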
I have a task that I run periodically (every minute) via Celery Beat. On occasion, the task takes longer than a minute to finish its execution, which results in the scheduler adding that task to the queue while the previous run is still executing.
Is there a way I can avoid the scheduler adding tasks to the queue if those tasks are already running?
Edit: I have seen Celery Beat: Limit to single task instance at a time
Note that my question is different. I'm asking how to avoid my task being enqueued, while that question is asking how to avoid the task being run multiple times.
I haven't had this particular problem, but a similar one where I had to avoid tasks being applied when a task of the same kind was already running or queued, though without Celery Beat. I went down a similar route with a locking mechanism, as in the answer you've linked. Unfortunately it won't be that easy here, since you want to avoid the enqueueing itself.
As far as I know, Celery doesn't support anything like this out of the box. I guess your best bet is to write a custom scheduler which inherits from Scheduler and overrides the apply_entry method or the apply_async method. In there you'd need a locking mechanism to check whether the task is already running, i.e. acquire a lock in the task and release it when the task finishes, and in apply_async check for that lock. You could use RedLock if you have Redis running already.
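A minimal sketch of that idea, assuming Redis is available and a made-up lock-key convention (the task itself would have to acquire beat-lock:<task name> when it starts and release it when it finishes):

import redis
from celery.beat import Scheduler

redis_client = redis.Redis()  # illustrative connection settings

class SkipIfRunningScheduler(Scheduler):
    def apply_entry(self, entry, producer=None):
        # If the lock for this task name is held, an instance is still
        # running (or queued), so skip enqueueing this beat entirely.
        if redis_client.exists("beat-lock:%s" % entry.task):
            return
        super().apply_entry(entry, producer=producer)

You would then point Beat at it with celery beat -S yourmodule:SkipIfRunningScheduler.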
After reading the docs, my understanding is that you cannot rerun Celery tasks outside of the application context.
Initially, I thought a terminated task would resume running once the worker had been restarted; however, it didn't. I am currently using
celery.control.terminate(task_id)
That terminates my Celery task by id. I then tried running a worker with the same name, hoping my revoked task would resume and finish; it didn't. After doing a bit of research, I saw that a task can be rerun with the same arguments; I thought MAYBE it would resume if I reran the same task, but it didn't. How can I revoke a task and then be able to rerun it?
I'm using .apply_async() to initiate my task.
Use revoke instead of terminate, e.g.:
celery_app.control.revoke(task_id)
You can refer to this solution as well:
Cancel an already executing task with Celery?
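For completeness, a hedged sketch of the revoke-then-resubmit flow; the module paths and saved arguments are hypothetical, and note that a revoked task never resumes -- it has to be submitted again as a brand-new task with a new id:

from myapp.celery import celery_app  # hypothetical app module
from myapp.tasks import my_task      # hypothetical task

# Revoke without terminate: the task is discarded if it hasn't started yet.
celery_app.control.revoke(task_id)

# Later, re-run with the same arguments; this creates a new task id.
result = my_task.apply_async(args=original_args, kwargs=original_kwargs)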
I'm getting started with Celery and I want to know if it is possible to add modules to celeryd processes that have already been started. In other words, instead of adding modules via celeryconfig.py, as in
CELERY_IMPORTS = ("tasks", "additional_module")
before starting the workers, I want to make additional_module available later somehow after the worker processes have started.
Thanks in advance.
You can achieve your goal by starting a new celeryd with an expanded import list and eventually gracefully shutting down your old worker (after it has finished its current jobs).
Because jobs are pushed to workers asynchronously and are only acknowledged after Celery has finished the work, you won't actually miss any work doing it this way. You should be able to run both Celery workers on the same machine; they'll simply show up as separate connections to RabbitMQ (or whatever queue backend you use).
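A sketch of that rolling restart under Celery 3.x; the app module and node name are illustrative:

celery worker --app=proj -n new_worker.%h  # new worker with the expanded CELERY_IMPORTS
sudo service celeryd stop                  # warm shutdown: old worker finishes its current jobs first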