Rescheduling the task when celery worker node crashes

Rescheduling the task when celery worker node crashes - python

In celery the worker node which is executing the task crashes in a mid-way through the task. Does the celerey reschedules the task excution?

Not by default, however, you can set the acks_late option on the task, or the task_acks_late option globally, to get that behavior. See:
http://docs.celeryproject.org/en/latest/faq.html#should-i-use-retry-or-acks-late
The acks_late setting would be used when you need the task to be
executed again if the worker (for some reason) crashes mid-execution.

Related

resume running terminated task in celery upon worker restart

After reading docs; what I understand is that you cannot rerun celery tasks outside of application contexts.
Initially; what i thought was; terminated task would resume running once the worker has been restarted; however it didn't. I am currently using
celery.control.terminate(task_id)
That terminates my celery task id; I then tried running a worker with the same name hoping my revoked task would resume and finish; it didn't. After doing bit of research; I saw that a task can be reran with the same arguments; I thought MAYBE it would resume if I reran the same task again, it didn't. How can I revoke a task - then be able to re run it.
I'm using .apply_async() to intiate my task.

Use revoke instead of terminate e.g:
celery_app.control.revoke(task_id)
you can refer this solution as well.
Cancel an already executing task with Celery?

Celery worker details

I have celery task with 100 input data in queue and need to execute using 5 workers.
How can I get which worker is executing which input?
Each worker executed how many inputs and its status?
If any task is failed how can get failed input data in separately and re-execute with available worker?
Is there any possible ways to customize celery based on worker specific.
We can combine celery worker limitation and flower
I am not using any framework.

How can I get which worker is executing which input?
There are 2 options to use multiple workers:
You run each worker separately with separate run commands
You run in one command using command line option -c i.e. concurrency
First method, flower will support it and will show you all the workers, all the tasks (you call inputs), which worker processed which task and other information too.
With second method, flower will show you all the tasks being processed by single worker. In this case you can only differentiate by viewing logs generated by celery worker as in logs it does store which worker THREAD executed which task. So, i think you will be better using first option given your requirements.
Each worker executed how many inputs and its status?
As I mentioned, using first approach, flower will give you this information.
If any task is failed how can get failed input data in separately and
re-execute with available worker?
Flower does provide the filters to filter the failed tasks and does provide what status tasks returned when exiting. Also you can set how many times celery should retry a failed task. But even after retries task fails, then you will have to relaunch the task yourself.

For the first and second question:
1) Using Flower API:
You can use celery flower to keep track of it. Flower api can provide you the information like which task is being executed by which worker through simple api calls (/api/task/info/<task_id>)
2) Querying celery directly:
from celery import Celery
celery = Celery('vwadaptor', broker='redis://workerdb:6379/0',backend='redis://workerdb:6379/0')
celery.control.inspect().active()
3) Using celery events:
Link : http://docs.celeryproject.org/en/latest/userguide/monitoring.html
(look Real-time Processing)
You can create an event ( task created, task received, etc) and the response will have the worker name(hostname , see the link)
For the third question:
Use the config entry 'CELERY_ACKS_LATE=True' to retry failed tasks.
celery.conf.update(
CELERY_ACKS_LATE=True,
)
You can also track failed tasks using celery events mentioned above and retry failed tasks manually.

Celery multi with queues set up not receiving tasks from django

I am running my workers with the following command:
celery -A myapp multi start 4 -l debug -Q1:3 queue1,queue2 -Q:4 queue3
The workers start out very well so when i run
celery inspect active_queues
the queues appear assigned.
Then i start tasks from my django app with the following code:
result = chain(task1.s(**kwargs).set(queue='queue1'),task2.s(**kwargs).set(queue='queue2'))()
i parse the result variable with result.parent to get all tasks IDs and record them to database for further inspection. When i issue
task = AsyncResult(task.id)
task.status
i get
'PENDING'
for every task i start with my chain. The celery logs doesn't seem to be receiving any tasks. However when i issue a
celery purge
command with a following
yes
i get message that my tasks has been actually removed from 1 queue
the AsyncResult.status on the deleted tasks from here on continue to show up as 'PENDING' and the tasks never start.
I use rabbitmq-server as a broker with all default configuration. My celery config is default. It is really strange but in another environment the same code and commands produce other results: The workers also start but they do receive the very same tasks and execute them without any issues. Please consider what might be an issue here.
p.s. when i start a worker the other way:
celery -A myapp worker -Q queue1,queue2,queue3 -l debug
i still cant get my tasks executing.
The problem started to show up when i modified my chain to launch tasks and added the
.set(queue='queue1')
or queue2 or queue3
p.p.s:
all my tasks are written with a
#shared_task
decorator
Is there at least a way to see which tasks (which i can remove by celery purge) are waiting on a queue and what is the queue name they are waiting for?

Celery default settings should cover your case, so only thing could be that you have defined some of the following option in a way that mute your queues, and in this case consider commenting them out (more in the docs):
CELERY_QUEUES
CELERY_ROUTES
CELERY_DEFAULT_EXCHANGE
CELERY_DEFAULT_ROUTING_KEY
CELERY_DEFAULT_ROUTING_KEY
As for your question, I guess that's not the full answer, but you can list all active queues from RabbitMQ.
Using Celery, from the doc:
celery -A proj inspect active
Using RabbitMQ, from the doc:
rabbitmqadmin list queues vhost name node messages message_stats.publish_details.rate

Notify celery task to stop during worker shutdown

I'm using celery 3.X and RabbitMQ backend. From time to time it needs to restart celery (to push a new source code update to the server). But there is a task with big loop and try/catch inside of the loop; it can takes a few hours to accomplish the task. Nothing critical will happen if I will stop it and will restart it later.
QUESTION: The problem is every time after I stopped the workers (via sudo service celeryd stop) I have to KILL the task manually (via kill -9); the task ignores SIGTERM from worker. I've read throw Celery docs & Stackoverflow but I can't find working solution. Any ideas how to fix the problem?

Sending the QUIT signal will stop workers immediately: sudo service celeryd stop -QUIT
If the CELERY_ACKS_LATE setting is set to True, tasks that were running when the worker stopped will run again when the worker starts back up.

Celery is not intended to run long tasks cause it blocks the worker for your task only. I recommend re-arranging your logic, making the task invoke itself instead of making the loop. Once shutdown is in progress, your current task will complete and will resume right at the same point where it stopped before celery shutdown.
Also, having task split into chunks, you will be able to divert the task to another worker/host which is probably what you would like to do in the future.

Retry a task in celery by task_id

I've launched a lot of tasks, but some of then hasn't finished (763 tasks), are in PENDING state, but the system isn't processing anything...
It's possible to retry this tasks giving celery the task_id?

You can't.
You can retry a task only from inside itself, you can't do it from outside.
The best thing to do in this case is to run again the task type with the same args, in this way you will do the same JOB but with a new PID that identify your process/task.
Remember also that the celery PENDING state not means only that the task is waiting for execution, but maybe that is unknown.
http://celeryq.org/docs/userguide/tasks.html#pending
I hope this could help

This works now after setting celery.conf.update(result_extended=True) which persists the arguments passed to the task:
def retry_task(task_id):
meta=celery.backend.get_task_meta(task_id)
task = celery.tasks[meta['name']]
task.apply_async(args=meta['args'], kwargs=meta['kwargs']) #specify any other parameters you might be passing

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.