I have a Celery project with a RabbitMQ backend that relies heavily on inspecting scheduled tasks. I found that the following code returns nothing most of the time (even though there are scheduled tasks):
i = app.control.inspect()
scheduled = i.scheduled()
if scheduled:
    # do something
This code also runs from inside one of the tasks, but I don't think that matters; I get the same result from the interactive Python command line (with some exceptions, see below).
At the same time, the celery -A <proj> inspect scheduled command never fails. I also noticed that when called from the interactive Python command line for the first time, i.scheduled() never fails either. Most of the subsequent i.scheduled() calls return nothing.
Does i.scheduled() only guarantee a result when called for the first time?
If so, why, and how can I then inspect scheduled tasks from within a task? Run a dedicated worker and restart it after every task? That seems like overkill for such a trivial job.
Please explain how to use this feature the right way.
This is caused by some odd behaviour inside the Celery app. To call methods of the Inspect object repeatedly, you have to create a new Celery app instance each time.
Here is a small snippet which can help you:
from celery import Celery

def inspect(method):
    # create a fresh Celery app (and a fresh broker connection) for every call
    app = Celery('app', broker='amqp://')
    return getattr(app.control.inspect(), method)()

print(inspect('scheduled'))
print(inspect('active'))
Problem
I have a Django + Celery app, and am trying to get a lengthy method to run in a Celery Task.
Goal
I want to
1) Put an entire line of execution - that calls across multiple classes & files - into a Celery Task. The application should continue serving users in the meantime.
2) Signal back when the Celery Task is done so I can update the UI.
Issue
Despite scouring docs, examples, and help sites I can't get this working.
I don't know which signals to use, how to register them, how to write the methods involved to receive those signals properly, and respond back when done.
It isn't clear to me from the Celery documentation how to accomplish all of this: there are many, many signals and decorators available but little clarification on which ones are appropriate for this sequence of events.
Help Requested
Either
A point in the direction of some resources that can help (better documentation, links to examples that fit my use case, etc...),
or
Some first-hand help with which signals are appropriate to use, where to use them, and why.
This is my first post on StackOverflow, so thanks in advance for your kindness and patience.
The desired execution path:
1) views.py: Get input via GET or POST
SomeMethod: Send a Signal to a static method in TaskManager (housed in TaskManager.py) to start a task with arguments
2) TaskManager.py: Process Signals, Keep Track of Tasks Running
SignalProcessor: receive signal, pull arguments out of kwargs, call appropriate method in tasks.py.
TaskList: Make note of the task in a TaskList so I know it's running, and where.
3) tasks.py
@shared_task
def HandleTheTask(arg1, arg2):
Call the appropriate methods in other files/classes in sequence, which ultimately write to the database many times. These other methods are not inside tasks.py.
Send a Signal to TaskManager when the method in tasks.py completes the lengthy task.
4) TaskManager.py
SignalProcessor: receive task_complete signal, remove the appropriate task from the TaskList.
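For reference, here is a minimal sketch of how this flow could be wired with a @shared_task plus Celery's task_postrun signal; the module names and helpers (myproj, run_lengthy_sequence, start_task) are assumptions, not code from the actual project:
# tasks.py
from celery import shared_task

@shared_task
def HandleTheTask(arg1, arg2):
    run_lengthy_sequence(arg1, arg2)   # hypothetical stand-in for the real call chain
    return 'done'

# TaskManager.py
from celery.signals import task_postrun
from myproj.tasks import HandleTheTask   # assumed project layout

task_list = {}   # task_id -> args; in practice this usually lives in the database

def start_task(arg1, arg2):
    # called from views.py after reading the GET/POST input
    result = HandleTheTask.delay(arg1, arg2)
    task_list[result.id] = (arg1, arg2)
    return result.id

@task_postrun.connect
def on_task_done(sender=None, task_id=None, state=None, **kwargs):
    # Runs in the *worker* process after the task finishes, not in the Django
    # web process, so an in-memory dict here is only visible to the worker;
    # persistent tracking normally goes through the database or result backend.
    if sender is not None and sender.name == HandleTheTask.name:
        task_list.pop(task_id, None)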
What I've already tried
Putting HandleTheTask(arg1, arg2) with a @task or @shared_task decorator into tasks.py.
I've tried calling HandleTheTask.delay(arg1, arg2) from the SignalProcessor method.
Result: Nothing happens. The app continues to execute, but the lengthy task doesn't. Celery doesn't notice the call and never executes it.
This is probably because the method I want to run is in another process?
Putting an intermediary method into tasks.py, which calls back to
TaskManager.py to run the lengthy task.
Result: As above.
Looking up Celery Signals to properly signal across processes: This
looks like the right solution but the documentation is big on info,
small on guidance.
Resources I've Consulted
I've already looked at a bunch of docs and help sites. Here are just a few of the resources I've consulted:
https://docs.celeryproject.org/en/stable/userguide/signals.html
https://medium.com/analytics-vidhya/integrating-django-signals-and-celery-cb2876ebd494
How will I know if celery background process is successful inside django code. If it is successful I want render a html page
Catch django signal sent from celery task
Django Signals in celery
Any help is welcome. Thank you!
There is a specific periodic task that needs to be removed from the message queue. I am using Redis and Celery here.
tasks.py
@periodic_task(run_every=crontab(minute='*/6'))
def task_abcd():
"""
some operations here
"""
There are other periodic tasks also in the project but I need to stop this specific task to stop from now on.
As explained in this answer, will the following code work?
@periodic_task(run_every=crontab(minute='*/6'))
def task_abcd():
pass
In this example the periodic task schedule is defined directly in code, meaning it is hard-coded and cannot be altered dynamically without a code change and an app re-deploy.
The provided code, with the task logic deleted or with a simple return at the beginning, will "work", but it is not really an answer to the question: the task will still run, there just is no code left for it to execute.
Also, it is recommended NOT to use @periodic_task, as it is deprecated:
"""Deprecated decorator, please use :setting:`beat_schedule`."""
First, change the method from being a @periodic_task to just a regular Celery @task; and since you are using Django, it is better to go straight for @shared_task:
from celery import shared_task

@shared_task
def task_abcd():
    ...
Now this is just an ordinary Celery task which needs to be called explicitly, or it can be run periodically if added to the celery beat schedule.
For production, and especially if using multiple workers, it is not recommended to run the celery worker with embedded beat (-B); run a separate celery beat scheduler instance.
The schedule can be specified in celery.py or in the Django project settings (settings.py).
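For example, a hard-coded schedule could look like this, assuming the usual namespaced Django configuration (app.config_from_object('django.conf:settings', namespace='CELERY')) and that task_abcd lives in myapp/tasks.py:
# settings.py
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    'run-task-abcd': {
        'task': 'myapp.tasks.task_abcd',     # assumed module path
        'schedule': crontab(minute='*/6'),   # same cadence as before
    },
}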
It is still not very dynamic, as the app needs to be reloaded in order to re-read the settings.
Then, use the Database Scheduler (django-celery-beat), which allows creating schedules dynamically: which tasks need to be run, when, and with what arguments. It even provides nice Django admin views for administration!
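With django-celery-beat, the schedule becomes a database row that can be created, edited or disabled without a re-deploy. A rough sketch, with the task path again assumed:
# requires django-celery-beat in INSTALLED_APPS and its migrations applied
from django_celery_beat.models import CrontabSchedule, PeriodicTask

schedule, _ = CrontabSchedule.objects.get_or_create(
    minute='*/6', hour='*', day_of_week='*', day_of_month='*', month_of_year='*',
)
PeriodicTask.objects.update_or_create(
    name='run task_abcd every 6 minutes',
    defaults={'crontab': schedule, 'task': 'myapp.tasks.task_abcd'},
)

# disabling the task later is just a row update, no re-deploy needed
PeriodicTask.objects.filter(name='run task_abcd every 6 minutes').update(enabled=False)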
That code will work but I'd go for something that doesn't force you to update your code every time you need to disable/enable the task.
What you could do is to use a configurable variable whose value could come from an admin panel, a configuration file, or whatever you want, and use that to return before your code runs if the task is in disabled mode.
For instance:
@periodic_task(run_every=crontab(minute='*/6'))
def task_abcd():
config = load_config_for_task_abcd()
if not config.is_enabled:
return
# some operations here
In this way, even if your task is scheduled, its operations won't be executed.
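The load_config_for_task_abcd helper is left abstract above; one illustrative way to back it (purely an assumption, not part of the original answer) is a tiny Django model, so the flag can be flipped from the admin:
# models.py -- hypothetical backing store for the flag
from django.db import models

class TaskConfig(models.Model):
    name = models.CharField(max_length=100, unique=True)
    is_enabled = models.BooleanField(default=True)

# importable by tasks.py
def load_config_for_task_abcd():
    config, _ = TaskConfig.objects.get_or_create(name='task_abcd')
    return config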
If you simply want to remove the periodic task, have you tried removing the function and then restarting your Celery service? You can restart your Redis service as well as your Django server for good measure.
Make sure that the function you removed is not referenced anywhere else.
I found some related questions, but nothing that describes exactly my problem. I can inspect my queue with code like this:
from celery.task.control import inspect
#inspect(['my_queue']), with a list instead of a str, should work!
i = inspect(['my_queue'])
print(i.active()) # get a list of active tasks
print(i.registered()) # get a list of tasks registered
print(i.scheduled()) # get a list of tasks waiting
print(i.reserved()) # tasks that have been received but are waiting to be executed
But somehow, on every second execution the method returns an empty task list. Sometimes I also get a connection reset error. Any ideas why this happens? Is there some kind of interval at which workers fill their active task lists, or something like that?
I assume the code you wrote above is not how the actual application looks (it can't work without a Celery object). The only explanation is that you have some connectivity issues; otherwise it should work every time you run it, unless there genuinely are no tasks to report. In other words, the cluster is idle.
Keep in mind that inspect broadcasts a message to all the workers and waits for their replies. If some of them time out for whatever reason(s), you will not see that worker in the output. If it happens that only that worker was busy, you may end up with an empty list of tasks.
Try calling something like celery -A yourproject.celeryapp status to see whether your workers are responsive, and if everything is OK, run your script. It should work.
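The same checks can also be done from Python; both ping() and inspect() accept a timeout, so raising it may help if workers are slow to reply (the broker URL and values below are placeholders):
from celery import Celery

app = Celery('proj', broker='amqp://')   # adjust to your actual app/broker

# rough equivalent of `celery status`: do the workers reply at all?
print(app.control.ping(timeout=5.0))

# give slow workers more time to answer the inspect broadcast
i = app.control.inspect(timeout=10.0)
print(i.active())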
I've read Testing with Celery but I'm still a bit confused. I want to test code that generates a Celery task by running the task manually and explicitly, something like:
def test_something(self):
do_something_that_generates_a_celery_task()
assert_state_before_task_runs()
run_task()
assert_state_after_task_runs()
I don't want to entirely mock up the creation of the task but at the same time I don't care about testing the task being picked up by a Celery worker. I'm assuming Celery works.
The actual context in which I'm trying to do this is a Django application where there's some code that takes too long to run in a request, so, it's delegated to background jobs.
In test mode, use CELERY_TASK_ALWAYS_EAGER = True. You can set this in your settings.py in Django if you have followed the default guide for django-celery configuration.
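For example, assuming the standard namespaced configuration (app.config_from_object('django.conf:settings', namespace='CELERY')), the test settings could contain:
# settings.py (or a dedicated test settings module)
CELERY_TASK_ALWAYS_EAGER = True       # .delay()/.apply_async() run the task inline
CELERY_TASK_EAGER_PROPAGATES = True   # exceptions raised in the task surface in the test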
I have code which deletes an API token when executed. Now I want it to execute after some time, let's say two weeks. Any ideas or directions on how to implement this?
My code:
authtoken = models.UserApiToken.objects.get(api_token=token)
authtoken.delete()
This is inside a function and executed when a request is made.
There are two main ways to get this done:
Make it a custom management command, and trigger it through crontab.
Use celery, make it a celery task, and use celerybeat to trigger the job after 2 weeks.
I would recommend Celery, as it gives you better control over your task queues and jobs.
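A rough sketch of the Celery approach (the task name and module layout are assumptions, not taken from the question); apply_async(countdown=...) or eta= schedules a single run roughly two weeks later, whereas a beat schedule would suit a recurring cleanup:
# tasks.py -- illustrative only
from datetime import timedelta
from celery import shared_task
from myapp import models   # assumed app name

@shared_task
def delete_api_token(token):
    models.UserApiToken.objects.filter(api_token=token).delete()

# in the request handler, instead of deleting immediately:
# delete_api_token.apply_async(args=[token], countdown=int(timedelta(weeks=2).total_seconds()))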