Task state not updating when using custom state - python

I have a task like this:
import time
from celery.decorators import task

@task
def test():
    time.sleep(10)
    test.update_state(state="PROGRESS")
    time.sleep(10)
    return "done"
I then run this:
>>> from celery.execute import send_task
>>> t = send_task("testcelery.test")
>>> t.state
'PENDING'
>>> t.state
'PROGRESS'
I can see in the worker that the task has completed:
[2011-02-19 20:18:43,851: INFO/MainProcess] Task testcelery.test[7598b170-2877-4d76-89a0-9bcc4c9f877e] succeeded in 20.0225799084s: 'done'
But t.state never changes from PROGRESS to SUCCESS. What am I doing wrong?

You should upgrade to Celery 2.2.4 (released yesterday) as it fixes the bug that causes this.
See http://celeryq.org/docs/changelog.html

It looks to me like having CELERY_IGNORE_RESULT set would cause this behavior. What is t.ignore_result? If it is true, then either change it or change the default. If you always want to inspect results, changing CELERY_IGNORE_RESULT globally makes more sense to me; otherwise, setting ignore_result on each task makes your intentions more obvious.
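If the global setting turns out to be the culprit, a minimal sketch of overriding it per task (ignore_result is a standard task option; the task body here is illustrative):

@task(ignore_result=False)
def test():
    # result will be stored even if CELERY_IGNORE_RESULT is enabled globally
    return "done"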


How can I detect whether I'm running in a Celery worker?

Is there a way to determine, programmatically, whether the current module is being imported/run in the context of a Celery worker?
We've settled on setting an environment variable before running the Celery worker, and checking this environment variable in the code, but I wonder if there's a better way?
Simple:
import sys

IN_CELERY_WORKER_PROCESS = (
    sys.argv
    and sys.argv[0].endswith('celery')
    and 'worker' in sys.argv
)

if IN_CELERY_WORKER_PROCESS:
    print('I am in a Celery worker')
http://percentl.com/blog/django-how-can-i-detect-whether-im-running-celery-worker/
As of Celery 4.2 you can also do this by setting a flag from a worker_ready signal handler in celery.py:
from celery import Celery
from celery.signals import worker_ready

app = Celery(...)
app.running = False

@worker_ready.connect
def set_running(*args, **kwargs):
    app.running = True
Now you can check app.running within your task to see whether or not you are running inside a worker. This can be very useful for deciding which logger to use.
Depending on what your use-case scenario is exactly, you may be able to detect it by checking whether the request id is set:
@app.task(bind=True)
def foo(self):
    print(self.request.id)
If you invoke the above as foo.delay() then the task will be sent to a worker and self.request.id will be set to a unique number. If you invoke it as foo(), then it will be executed in your current process and self.request.id will be None.
You can use the current_worker_task property from the Celery application instance class. Docs here.
With the following task defined:
# whatever_app/tasks.py
from celery import Celery

celery_app = Celery('whatever_app')

@celery_app.task
def test_task():
    if celery_app.current_worker_task:
        return 'running in a celery worker'
    return 'just running'
You can run the following on a python shell:
In [1]: from whatever_app.tasks import test_task
In [2]: test_task()
Out[2]: 'just running'
In [3]: r = test_task.delay()
In [4]: r.result
Out[4]: u'running in a celery worker'
Note: Obviously for test_task.delay() to succeed, you need to have at least one celery worker running and configured to load tasks from whatever_app.tasks.
Adding an environment variable is a good way to check whether the module is being run by a Celery worker. In the task submitter process we may set the environment variable to mark that it is not running in the context of a celery worker.
But a better way may be to use Celery signals, which can tell you whether the module is running in a worker or in the task submitter. For example, the worker_process_init signal is sent to each child task executor process (in prefork mode), and its handler can set a global variable indicating that it is a worker process.
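A minimal sketch of that signal-based approach (the flag name is illustrative):
from celery.signals import worker_process_init

IN_CELERY_WORKER = False  # illustrative module-level flag

@worker_process_init.connect
def mark_worker_process(**kwargs):
    # runs in each prefork child process, so the flag is only set inside workers
    global IN_CELERY_WORKER
    IN_CELERY_WORKER = True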
It is a good practice to start workers with names, so that it becomes easier to manage (stop/kill/restart) them. You can use -n to name a worker.
celery worker -l info -A test -n foo
Now, in your script you can use app.control.inspect to see if that worker is running.
In [22]: import test
In [23]: i = test.app.control.inspect(['foo'])
In [24]: i.app.control.ping()
Out[24]: [{'celery@foo': {'ok': 'pong'}}]
You can read more about this in celery worker docs
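As a small follow-up sketch, the ping reply can be reduced to a boolean (the destination name is illustrative and its exact format may vary by Celery version):
# a non-empty ping reply means the named worker responded
replies = test.app.control.ping(destination=['celery@foo'], timeout=1.0)
worker_is_running = bool(replies)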

Aliases for celery tasks

I am switching task naming scheme. There are parts of the code which still use old names, and some which use new names. So, my question is: what is the proper way of aliasing Celery tasks?
@task
def new_task_name():
    pass

old_task_name = new_task_name  # doesn't work

app.tasks['old_task_name'] = new_task_name  # still doesn't work
I get an error similar to this:
Received unregistered task of type 'app.tasks.old_task_name'
UPDATE:
My current solution is forwarding tasks. But I still hope there's a cleaner approach:
@task
def old_task_name():
    new_task_name.delay()

@app.task(name='this-is-the-name')
def new_task_name():
    pass
This question is ancient but a more direct way to do this is:
@task(name='old-name')
def old_task_name(*args, **kwargs):
    return new_task_name(*args, **kwargs)
Celery tasks can still be called as normal functions too.
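Another option, sketched here on the assumption of a module-level app = Celery(...) instance, is to register the same plain function under both names so either route resolves without a forwarding hop (the names below are illustrative):
def _new_task_impl(*args, **kwargs):
    # shared implementation registered under two task names
    pass

new_task_name = app.task(name='app.tasks.new_task_name')(_new_task_impl)
old_task_name = app.task(name='app.tasks.old_task_name')(_new_task_impl)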

django celery: update_state not doing anything

I am writing a small test task for django-celery, in which I would like to set a custom state (and some data, but let's start with a custom state first).
I use django as the messaging backend. My version of python is 2.6.
Here's the content of tasks.py
import time
from djcelery import celery

@celery.task
def generate():
    generate.update_state(state="PROGRESS")
    time.sleep(10)
    return True
And here's what happens when I give it a try:
>>> import tasks
>>> result = tasks.generate.delay()
>>> result
<AsyncResult: f72574aa-f8c5-49dc-89d4-47d2012a4d6d>
# status and state are the same, but just to make sure
>>> result.status
u'PENDING'
>>> result.state
u'PENDING'
>>> result.result
# empty, as in None
# wait a few seconds
>>> result.status
u'SUCCESS'
>>> result.state
u'SUCCESS'
>>> result.result
True
I can't figure out why the state is PENDING while it should be PROGRESS. Any idea?
I've already looked at the documentation, and here's the relevant link: http://docs.celeryproject.org/en/latest/userguide/tasks.html#custom-states
I do the exact same thing (minus the meta, though I also tried with it, without success), so it should work.
UPDATE: I found out why: it looks like you have to restart the celery daemon whenever you update your tasks so that the changes are taken into account.
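Once the worker has been restarted and actually reports the custom state, the caller side can poll it roughly like this (a hedged sketch; it assumes a result backend is configured, and result.info holds whatever meta dict was passed to update_state):
result = generate.delay()
if result.state == "PROGRESS":
    print(result.info)  # the meta dict passed to update_state, if any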

Cancel an already executing task with Celery?

I have been reading the doc and searching but cannot seem to find a straight answer:
Can you cancel an already executing task? (as in the task has started, takes a while, and halfway through it needs to be cancelled)
I found this from the doc at Celery FAQ
>>> result = add.apply_async(args=[2, 2], countdown=120)
>>> result.revoke()
But I am unclear if this will cancel queued tasks or if it will kill a running process on a worker. Thanks for any light you can shed!
revoke cancels the task execution. If a task is revoked, the workers ignore the task and do not execute it. If you don't use persistent revokes, your task can be executed after a worker restart.
https://docs.celeryq.dev/en/stable/userguide/workers.html#worker-persistent-revokes
revoke has a terminate option, which is False by default. If you need to kill the executing task you need to set terminate to True.
>>> from celery.task.control import revoke
>>> revoke(task_id, terminate=True)
https://docs.celeryq.dev/en/stable/userguide/workers.html#revoke-revoking-tasks
In Celery 3.1, the API for revoking tasks changed.
According to the Celery FAQ, you should use result.revoke:
>>> result = add.apply_async(args=[2, 2], countdown=120)
>>> result.revoke()
or if you only have the task id:
>>> from proj.celery import app
>>> app.control.revoke(task_id)
The answer by 0x00mh is correct; however, recent celery docs say that using the terminate option is "a last resort for administrators", because you may accidentally terminate another task which started executing in the meantime. Possibly a better solution is combining terminate=True with signal='SIGUSR1' (which causes the SoftTimeLimitExceeded exception to be raised in the task).
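A minimal sketch of handling that soft interrupt inside the task (app, do_work and clean_up are illustrative placeholders for your own code):
from celery.exceptions import SoftTimeLimitExceeded

@app.task
def long_running():
    try:
        do_work()       # placeholder for the actual work
    except SoftTimeLimitExceeded:
        clean_up()      # placeholder cleanup when revoked with signal='SIGUSR1'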
Per the 5.2.3 documentation, the following command can be run:
celery.control.revoke(task_id, terminate=True, signal='SIGKILL')
where
celery = Celery(app.name, broker=app.config['CELERY_BROKER_URL'])
Link to the doc: https://docs.celeryq.dev/en/stable/reference/celery.app.control.html?highlight=revoke#celery.app.control.Control.revoke
In addition, though less satisfactory, there is another way to stop a task (the abortable task approach), but it has several reliability caveats; for more details, see:
http://docs.celeryproject.org/en/latest/reference/celery.contrib.abortable.html
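A hedged sketch of that abortable-task approach (the task only stops if it periodically checks is_aborted, and the linked docs note the mechanism depends on the result backend; app and the loop body are illustrative):
from celery.contrib.abortable import AbortableTask

@app.task(bind=True, base=AbortableTask)
def long_task(self):
    for step in range(100):
        if self.is_aborted():
            return 'aborted'
        # ... do one small unit of work per iteration ...
The caller can then request the abort with AbortableAsyncResult(task_id).abort().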
You define the celery app with a broker and backend, something like:
from celery import Celery
celeryapp = Celery('app', broker=redis_uri, backend=redis_uri)
When you call send_task it returns a unique id for the task:
task_id = celeryapp.send_task('run.send_email', queue = "demo")
To revoke the task you need the celery app and the task id:
celeryapp.control.revoke(task_id, terminate=True)
from celery.app import default_app
revoked = default_app.control.revoke(task_id, terminate=True, signal='SIGKILL')
print(revoked)
See the following options for tasks: time_limit, soft_time_limit (or you can set them for workers). If you want to control more than just execution time, see the expires argument of the apply_async method.
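A brief sketch of those options (app and the values are illustrative):
@app.task(soft_time_limit=240, time_limit=300)
def bounded_task():
    ...

# discard the task if it has not started within 10 minutes of being queued
bounded_task.apply_async(expires=600)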

Is there something wrong with my Python code? (functions)

# tasks.py
from celery.decorators import task

@task()
def add(x, y):
    add.delay(1, 9)
    return x + y
>>> import tasks
>>> res = tasks.add.delay(5, 2)
>>> res.result
7
If I run this code, I expect tasks to be continuously added to the queue. But they're not! Only the first task (5, 2) gets added to the queue and processed.
There should continuously be tasks being added, due to this line: "add.delay(1,9)"
Note: I need each task to execute another task.
As far as I can see, the periodic_task decorator creates periodic tasks, while task creates just one task, and delay just executes it asynchronously.
You should just use periodic_task instead of recursion.
add inside the function body refers to the original function, not its decorated version.
If you just need to run a task repeatedly, use @periodic_task instead. You only need recursion if the delay is different each time. In that case, subclass Task instead of using the decorator and you'll be able to use recursion without a problem.
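A minimal sketch of the periodic-task approach, using the same old-style decorator module as the question (the interval is illustrative):
from datetime import timedelta
from celery.decorators import periodic_task

@periodic_task(run_every=timedelta(seconds=30))
def add_periodically():
    return 1 + 9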
You should look at subtasks and callbacks; they might give you the answer you are looking for:
http://celeryproject.org/docs/userguide/tasksets.html
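For example, a hedged sketch using a callback instead of calling delay() inside the task body (this assumes a Celery version with signatures, where the link callback receives the parent task's result as its first argument):
# runs add(5, 2) on a worker, then add(7, 9) as a callback with the result 7
add.apply_async((5, 2), link=add.s(9))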
