I have been reading the docs and searching, but cannot seem to find a straight answer:
Can you cancel an already executing task? (As in, the task has started, takes a while, and halfway through it needs to be cancelled.)
I found this in the Celery FAQ:
>>> result = add.apply_async(args=[2, 2], countdown=120)
>>> result.revoke()
But I am unclear whether this will cancel queued tasks or whether it will kill a running process on a worker. Thanks for any light you can shed!
revoke cancels the task execution. If a task is revoked, the workers ignore the task and do not execute it. If you don't use persistent revokes, your task can be executed after a worker restart.
https://docs.celeryq.dev/en/stable/userguide/workers.html#worker-persistent-revokes
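For example, persistent revokes can be enabled by pointing the worker at a state file (the path here is illustrative):
celery -A proj worker --statedb=/var/run/celery/worker.state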
revoke has a terminate option, which is False by default. If you need to kill the executing task, you need to set terminate to True.
>>> from celery.task.control import revoke
>>> revoke(task_id, terminate=True)
https://docs.celeryq.dev/en/stable/userguide/workers.html#revoke-revoking-tasks
In Celery 3.1, the API for revoking tasks was changed.
According to the Celery FAQ, you should use result.revoke:
>>> result = add.apply_async(args=[2, 2], countdown=120)
>>> result.revoke()
or if you only have the task id:
>>> from proj.celery import app
>>> app.control.revoke(task_id)
@0x00mh's answer is correct; however, recent Celery docs say that using the terminate option is "a last resort for administrators", because you may accidentally terminate another task which started executing in the meantime. Possibly a better solution is combining terminate=True with signal='SIGUSR1' (which causes the SoftTimeLimitExceeded exception to be raised in the task).
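A minimal sketch of that combination, assuming app is your Celery instance (the work and cleanup helpers are hypothetical):
from celery.exceptions import SoftTimeLimitExceeded

@app.task(bind=True)
def long_task(self):
    try:
        do_long_running_work()  # hypothetical long-running helper
    except SoftTimeLimitExceeded:
        clean_up()  # hypothetical cleanup hook; runs when the task is revoked with SIGUSR1
        raise

# Caller side:
# app.control.revoke(task_id, terminate=True, signal='SIGUSR1')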
Per the 5.2.3 documentation, the following command can be run:
celery.control.revoke(task_id, terminate=True, signal='SIGKILL')
where
celery = Celery(app.name, broker=app.config['CELERY_BROKER_URL'])
Link to the doc: https://docs.celeryq.dev/en/stable/reference/celery.app.control.html?highlight=revoke#celery.app.control.Control.revoke
In addition, though less satisfactory, there is another way to stop a task: abortable tasks. This approach has a number of reliability caveats; for more details, see:
http://docs.celeryproject.org/en/latest/reference/celery.contrib.abortable.html
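A minimal sketch of an abortable task, assuming app is your Celery instance and a result backend is configured (the work helpers are hypothetical):
from celery.contrib.abortable import AbortableTask, AbortableAsyncResult

@app.task(bind=True, base=AbortableTask)
def long_task(self):
    for chunk in work_chunks():  # hypothetical iterator over units of work
        if self.is_aborted():
            return 'aborted'
        process(chunk)  # hypothetical per-chunk work

# Caller side: abort() sets a flag that the task polls; it does not kill the process.
# result = long_task.delay()
# AbortableAsyncResult(result.id).abort()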
You define the Celery app with a broker and backend, something like:
from celery import Celery
celeryapp = Celery('app', broker=redis_uri, backend=redis_uri)
When you send a task, it returns an AsyncResult; its id uniquely identifies the task:
task_id = celeryapp.send_task('run.send_email', queue="demo").id
To revoke the task you need the Celery app and the task id:
celeryapp.control.revoke(task_id, terminate=True)
from celery.app import default_app
revoked = default_app.control.revoke(task_id, terminate=True, signal='SIGKILL')
print(revoked)
See the following options for tasks: time_limit and soft_time_limit (or you can set them for workers). If you want to control not only the execution time, see the expires argument of the apply_async method.
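For example (a sketch; app and the task body are assumptions, and the timings are illustrative):
from celery.exceptions import SoftTimeLimitExceeded

@app.task(time_limit=60, soft_time_limit=50)
def constrained():
    try:
        do_work()  # hypothetical; gets up to 50 seconds before the soft limit fires
    except SoftTimeLimitExceeded:
        pass  # the hard limit at 60 seconds kills the worker child outright

# Discard the task if it has not started within 2 minutes of being sent:
add.apply_async(args=[2, 2], expires=120)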
I think I must be missing something. On my Django server I do the following:
inspector = app.control.inspect()

for element in list:
    result = app.send_task("workerTasks.do_processing_task", args=[element, "spd"], queue="cloud")

scheduled = inspector.scheduled()
reserved = inspector.reserved()
active = inspector.active()

print("-- SCHEDULED")
print(scheduled)
print("-- RESERVED")
print(reserved)
print("-- ACTIVE")
print(active)

global_helper.execute_commands(["sudo rabbitmqctl list_queues"])
global_helper.execute_commands(["celery inspect active_queues"])
(excuse the printing, this is remote and I haven't set up remote debugging)
There are no workers connected to the server, and I get this output:
-- SCHEDULED
None
-- RESERVED
None
-- ACTIVE
None
[ sudo rabbitmqctl list_queues ]
Listing queues ...
[ celery inspect active_queues ]
Error: No nodes replied within time constraint.
So it looks like all of these tools require jobs to be on a worker somewhere.
So my question is, how can I look at tasks which are still waiting to be picked up by a worker?
(I have verified that app.send_task has been called 127 times)
Celery does not provide this facility (probably there is a good reason for it), so you need to use your broker's facilities to find out what you want to know. Here is how for Redis and RabbitMQ.
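For example, to count messages still sitting in a queue (a sketch; the queue name "cloud" is taken from the question, and the connection details are assumptions):
# RabbitMQ, using the pika client:
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
queue = channel.queue_declare(queue='cloud', passive=True)  # passive=True inspects without creating
print(queue.method.message_count)  # messages waiting to be picked up by a worker
connection.close()

# Redis broker: Celery keeps pending tasks in a list named after the queue.
import redis

print(redis.Redis().llen('cloud'))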
I am using Celery to asynchronously perform a group of operations. There are a lot of these operations and each may take a long time, so rather than send the results back in the return value of the Celery worker function, I'd like to send them back one at a time as custom state updates. That way the caller can implement a progress bar with a change state callback, and the return value of the worker function can be of constant size rather than linear in the number of operations.
Here is a simple example in which I use the Celery worker function add_pairs_of_numbers to add a list of pairs of numbers, sending back a custom status update for every added pair.
#!/usr/bin/env python
"""
Run worker with:
    celery -A tasks worker --loglevel=info
"""
from celery import Celery

app = Celery("tasks", broker="pyamqp://guest@localhost//", backend="rpc://")


@app.task(bind=True)
def add_pairs_of_numbers(self, pairs):
    for x, y in pairs:
        self.update_state(state="SUM", meta={"x": x, "y": y, "x+y": x + y})
    return len(pairs)


def handle_message(message):
    if message["status"] == "SUM":
        x = message["result"]["x"]
        y = message["result"]["y"]
        print(f"Message: {x} + {y} = {x+y}")


def non_looping(*pairs):
    task = add_pairs_of_numbers.delay(pairs)
    result = task.get(on_message=handle_message)
    print(result)


def looping(*pairs):
    task = add_pairs_of_numbers.delay(pairs)
    print(task)
    while True:
        pass


if __name__ == "__main__":
    import sys

    if sys.argv[1:] and sys.argv[1] == "looping":
        looping((3, 4), (2, 7), (5, 5))
    else:
        non_looping((3, 4), (2, 7), (5, 5))
If you run just ./tasks.py it executes the non_looping function. This does the standard Celery thing: it makes a delayed call to the worker function and then uses get to wait for the result. A handle_message callback function prints each message, and the number of pairs added is returned as the result. This is what I want.
$ ./tasks.py
Message: 3 + 4 = 7
Message: 2 + 7 = 9
Message: 5 + 5 = 10
3
Though the non-looping scenario is sufficient for this simple example, the real-world task I'm trying to accomplish is processing a batch of files instead of adding pairs of numbers. Furthermore, the client is a Flask REST API and therefore cannot contain any blocking get calls. In the script above I simulate this constraint with the looping function. This function starts the asynchronous Celery task, but does not wait for a response. (The infinite while loop that follows simulates the web server continuing to run and handle other requests.)
If you run the script with the argument "looping" it runs this code path. Here it immediately prints the Celery task ID then drops into the infinite loop.
$ ./tasks.py looping
a39c54d3-2946-4f4e-a465-4cc3adc6cbe5
The Celery worker logs show that the add operations are performed, but the caller doesn't define a callback function, so it never gets the results.
(I realize that this particular example is embarrassingly parallel, so I could use chunks to divide this up into multiple tasks. However, in my non-simplified real-world case I have tasks that cannot be parallelized.)
What I want is to be able to specify a callback in the looping scenario. Something like this.
def looping(*pairs):
    task = add_pairs_of_numbers.delay(pairs, callback=handle_message)  # There is no such callback.
    print(task)
    while True:
        pass
In the Celery documentation and all the examples I can find online (for example this), there is no way to define a callback function as part of the delay call or its apply_async equivalent. You can only specify one as part of a get callback. That's making me think this is an intentional design decision.
In my REST API scenario I can work around this by having the Celery worker process send a "status update" back to the Flask server in the form of an HTTP post, but this seems weird because I'm starting to replicate messaging logic in HTTP that already exists in Celery.
Is there any way to write my looping scenario so that the caller receives callbacks without making a blocking call, or is that explicitly forbidden in Celery?
It's a pattern that is not supported by Celery, although you can (somewhat) trick it out by posting custom state updates to your task, as described here.
Use update_state() to update a task's state:
@app.task(bind=True)
def upload_files(self, filenames):
    for i, file in enumerate(filenames):
        if not self.request.called_directly:
            self.update_state(state='PROGRESS',
                              meta={'current': i, 'total': len(filenames)})
The reason that Celery does not support such a pattern is that task producers (callers) are strongly decoupled from the task consumers (workers): the only communication between the two is the broker, carrying messages from producers to consumers, and the result backend, carrying messages from consumers to producers. The closest you can get currently is by polling the task state, or by writing a custom result backend that allows you to post events, either via AMQP RPC or Redis subscriptions.
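For example, a Flask endpoint could poll the custom state without blocking (a sketch using the "SUM" state from the question; flask_app and app are assumed to be the Flask and Celery instances):
from celery.result import AsyncResult

@flask_app.route('/progress/<task_id>')
def progress(task_id):
    result = AsyncResult(task_id, app=app)
    if result.state == 'SUM':
        return result.info  # the meta dict, e.g. {'x': 3, 'y': 4, 'x+y': 7}
    return {'state': result.state}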
My objective is to schedule an Azure Batch Task to run every 5 minutes from the moment it has been added, and I use the Python SDK to create/manage my Azure resources. I tried creating a Job-Schedule and it automatically created a new Job under the specified Pool.
job_spec = batch.models.JobSpecification(
    pool_info=batch.models.PoolInformation(pool_id=pool_id)
)
schedule = batch.models.Schedule(
    start_window=datetime.timedelta(hours=1),
    recurrence_interval=datetime.timedelta(minutes=5)
)
setup = batch.models.JobScheduleAddParameter(
    'python_test_schedule',
    schedule,
    job_spec
)
batch_client.job_schedule.add(setup)
I then added a task to this new Job, but the task seems to run only once, as soon as it is added (like a normal task). Is there something more that I need to do to make the task run recurrently? There doesn't seem to be much documentation or many examples of JobSchedule either.
Thank you! Any help is appreciated.
You are correct in that a JobSchedule will create a new job at the specified time interval. Additionally, you cannot have a task "re-run" every 5 minutes once it has completed. You could do either of the following:
1. Have one task that runs a loop, performing the same action every 5 minutes.
2. Use a Job Manager to add a new task (that does the same thing) every 5 minutes.
I would probably recommend the 2nd option, as it has a little more flexibility to monitor the progress of the tasks and job and take actions accordingly.
An example client which creates the job might look a bit like this:
job_manager = models.JobManagerTask(
    id='job_manager',
    command_line="/bin/bash -c 'python ./job_manager.py'",
    environment_settings=[
        models.EnvironmentSetting(name='AZ_BATCH_KEY', value=AZ_BATCH_KEY)],
    resource_files=[
        models.ResourceFile(blob_sas="https://url/to/job_manager.py", file_name="job_manager.py")],
    authentication_token_settings=models.AuthenticationTokenSettings(
        access=[models.AccessScope.job]),
    kill_job_on_completion=True,  # This will mark the job as complete once the Job Manager has finished.
    run_exclusive=False)  # Whether the Job Manager needs a dedicated VM - this will depend on the nature of the other tasks running on the VM.

new_job = models.JobAddParameter(
    id='my_job',
    job_manager_task=job_manager,
    pool_info=models.PoolInformation(pool_id='my_pool'))

batch_client.job.add(new_job)
Now we need a script to run as the Job Manager on the compute node. In this case I will use Python, so you will need to add a StartTask to your pool (or a JobPreparationTask to the job) to install the azure-batch Python package.
Additionally the Job Manager Task will need to be able to authenticate against the Batch API. There are two methods of doing this depending on the scope of activities that the Job Manager will perform. If you only need to add tasks, then you can use the authentication_token_settings attribute, which will add an AAD token environment variable to the Job Manager task with permissions to ONLY access the current job. If you need permission to do other things, like alter the pool, or start new jobs, you can pass an account key via environment variable. Both options are shown above.
The script you run on the Job Manager task could look something like this:
import os
import time
from azure.batch import BatchServiceClient
from azure.batch.batch_auth import SharedKeyCredentials
from azure.batch import models
# Batch account credentials
AZ_BATCH_ACCOUNT = os.environ['AZ_BATCH_ACCOUNT_NAME']
AZ_BATCH_KEY = os.environ['AZ_BATCH_KEY']
AZ_BATCH_ENDPOINT = os.environ['AZ_BATCH_ENDPOINT']
# If you're using the authentication_token_settings for authentication
# you can use the AAD token in the environment variable AZ_BATCH_AUTHENTICATION_TOKEN.
def main():
# Batch Client
creds = SharedKeyCredentials(AZ_BATCH_ACCOUNT, AZ_BATCH_KEY)
batch_client = BatchServiceClient(creds, base_url=AZ_BATCH_ENDPOINT)
# You can set up the conditions under which your Job Manager will continue to add tasks here.
# It could be a timeout, max number of tasks, or you could monitor tasks to act on task status
condition = True
task_id = 0
task_params = {
"command_line": "/bin/bash -c 'echo hello world'",
# Any other task parameters go here.
}
while condition:
new_task = models.TaskAddParameter(id=task_id, **task_params)
batch_client.task.add(AZ_JOB, new_task)
task_id += 1
# Perform any additional log here - for example:
# - Check the status of the tasks, e.g. stdout, exit code etc
# - Process any output files for the tasks
# - Delete any completed tasks
# - Error handling for tasks that have failed
time.sleep(300) # Wait for 5 minutes (300 seconds)
# Job Manager task has completed - it will now exit and the job will be marked as complete.
if __name__ == '__main__':
main()
job_spec = batchmodels.JobSpecification(
    pool_info=pool_info,
    job_manager_task=batchmodels.JobManagerTask(
        id="JobManagerTask",
        # Specify the command that needs to run recurrently.
        command_line="/bin/bash -c \"python3 task.py\""
    ))
Add the task that you want to run recurrently as a JobManagerTask inside the JobSpecification, as shown above. Now this JobManagerTask will run recurrently.
Is there a way to determine, programmatically, whether the current module is being imported/run in the context of a Celery worker?
We've settled on setting an environment variable before running the Celery worker, and checking this environment variable in the code, but I wonder if there's a better way?
Simple,
import sys

IN_CELERY_WORKER_PROCESS = sys.argv and sys.argv[0].endswith('celery') \
    and 'worker' in sys.argv

if IN_CELERY_WORKER_PROCESS:
    print("I'm in a Celery worker")
http://percentl.com/blog/django-how-can-i-detect-whether-im-running-celery-worker/
As of Celery 4.2 you can also do this by setting a flag from a worker_ready signal handler
in celery.py:
from celery import Celery
from celery.signals import worker_ready

app = Celery(...)
app.running = False


@worker_ready.connect
def set_running(*args, **kwargs):
    app.running = True
Now you can check within your task by using the global app instance
to see whether or not you are running. This can be very useful to determine which logger to use.
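For example (a sketch; the logger helpers are hypothetical):
@app.task
def my_task():
    # Pick the worker logger when running under a worker, a local one otherwise.
    log = get_worker_logger() if app.running else get_local_logger()
    log.info('running my_task')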
Depending on what your use-case scenario is exactly, you may be able to detect it by checking whether the request id is set:
@app.task(bind=True)
def foo(self):
    print(self.request.id)
If you invoke the above as foo.delay() then the task will be sent to a worker and self.request.id will be set to a unique number. If you invoke it as foo(), then it will be executed in your current process and self.request.id will be None.
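For example:
foo.delay()  # executed by a worker; inside the task, self.request.id is a UUID string
foo()        # executed inline in the current process; self.request.id is None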
You can use the current_worker_task property from the Celery application instance class. Docs here.
With the following task defined:
# whatever_app/tasks.py
celery_app = Celery(app)


@celery_app.task
def test_task():
    if celery_app.current_worker_task:
        return 'running in a celery worker'
    return 'just running'
You can run the following on a python shell:
In [1]: from whatever_app.tasks import test_task
In [2]: test_task()
Out[2]: 'just running'
In [3]: r = test_task.delay()
In [4]: r.result
Out[4]: u'running in a celery worker'
Note: Obviously for test_task.delay() to succeed, you need to have at least one celery worker running and configured to load tasks from whatever_app.tasks.
Adding an environment variable is a good way to check whether the module is being run by a Celery worker. In the task submitter process, we may set the environment variable to mark that it is not running in the context of a Celery worker.
But a better way may be to use Celery signals, which can tell you whether the module is running in a worker or in the task submitter. For example, the worker_process_init signal is sent to each child task-executor process (in prefork mode), and its handler can set a global variable indicating that this is a worker process, as in the sketch below.
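A minimal sketch of that approach (the module-level flag name is illustrative):
from celery.signals import worker_process_init

IN_CELERY_WORKER = False

@worker_process_init.connect
def mark_worker_process(**kwargs):
    # Runs once in each prefork child process, never in the task submitter.
    global IN_CELERY_WORKER
    IN_CELERY_WORKER = True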
It is good practice to start workers with names, so that it becomes easier to manage (stop/kill/restart) them. You can use -n to name a worker:
celery worker -l info -A test -n foo
Now, in your script you can use app.control.inspect to see if that worker is running.
In [22]: import test
In [23]: i = test.app.control.inspect(['foo'])
In [24]: i.app.control.ping()
Out[24]: [{'celery@foo': {'ok': 'pong'}}]
You can read more about this in the Celery worker docs.
I am writing a small test task for django-celery, in which I would like to set a custom state (and some data, but let's start with a custom state first).
I use Django as the messaging backend. My version of Python is 2.6.
Here's the content of tasks.py
import time

from djcelery import celery


@celery.task
def generate():
    generate.update_state(state="PROGRESS")
    time.sleep(10)
    return True
And here's what happens when I give it a try:
>>> import tasks
>>> result = tasks.generate.delay()
>>> result
<AsyncResult: f72574aa-f8c5-49dc-89d4-47d2012a4d6d>
# status and state are the same, but just to make sure
>>> result.status
u'PENDING'
>>> result.state
u'PENDING'
>>> result.result
# empty, as in None
# wait a few seconds
>>> result.status
u'SUCCESS'
>>> result.state
u'SUCCESS'
>>> result.result
True
I can't figure out why the state is PENDING while it should be PROGRESS. Any idea?
I've already looked at the documentation, and here's the relevant link: http://docs.celeryproject.org/en/latest/userguide/tasks.html#custom-states
I do the exact same thing (minus the meta, but I also tried without success), so it should work.
UPDATE: I found out why; it looks like you have to restart the celery daemon whenever you update your tasks, so that the changes are taken into account.