Celery beat queue includes obsolete tasks - python

I'm using periodic celery tasks with Django. I used to have the following task in my app/tasks.py file:
@periodic_task(run_every=timedelta(minutes=2))
def stuff():
    ...
But this task has since been removed from my app/tasks.py file. However, I keep seeing calls to this task in my celery logs:
[2013-05-21 07:08:37,963: ERROR/MainProcess] Received unregistered task of type u'app.tasks.stuff'.
It seems that the celery beat scheduler that I use does not update its queue. This is how the scheduler is defined in my project/settings.py file:
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
Restarting the celery worker does not help. FYI, I use a Redis broker.
How can I either clear or update the celery beat queue so that older tasks are not sent to my celery worker?

Install django-celery.
As noted, this project is not required to use celery, but you need it to enable the admin interface at /admin/djcelery/ for managing periodic tasks. Initially there won't be any registered or periodic tasks there.
Restart beat and check the Periodic tasks table again. Beat will have added the existing scheduled tasks into that table, with the interval or crontab defined in the settings or the decorators. There you can delete the unwanted tasks.
UPDATE: As of Celery 4, it's recommended to use the django-celery-beat package instead: https://github.com/celery/django-celery-beat
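With django-celery-beat, a minimal sketch of wiring up its DatabaseScheduler might look like the following (this assumes Celery 4+ and the usual app.config_from_object('django.conf:settings', namespace='CELERY') setup, so the setting name follows that namespace convention):

# project/settings.py -- sketch, assumes django-celery-beat is installed and migrated
INSTALLED_APPS = [
    # ...
    'django_celery_beat',
]

# Store the beat schedule in the database, so periodic tasks can be added or
# removed from the Django admin without touching code.
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'

Alternatively, the scheduler can be chosen on the command line with celery -A proj beat --scheduler django_celery_beat.schedulers:DatabaseScheduler.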

Delete the .pyc file for the module where the task was originally written. Or just delete all .pyc files in your project's directory.
This command should work:
find . -name "*.pyc" -exec rm -rf {} \;
How do I remove all .pyc files from a project?

Related

How to run django unit test in jenkins, that depend on celery

I have unit tests for my django project.
Some of the views in my django project run celery tasks, and I want to check the database after these tasks.
I have separate tests for the celery tasks, where I call them without the .delay() method.
The main problem: what is the best and cleanest way to have a celery worker during the jenkins job?
Currently I just run nohup celery -A myqpp worker & before the tests and kill all running celery processes at the end of the job.
The best and cleanest way is not to have any celery workers during the Jenkins job, nor any queue/result backend. Use the CELERY_ALWAYS_EAGER setting to execute your tasks locally in unit tests, blocking until each task returns.
Read more in the Celery documentation: CELERY_ALWAYS_EAGER docs
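As a rough sketch of what that looks like in a test settings module or a test case (the test class and URL below are made-up examples; the setting names are the Celery 3.x spellings):

# settings_test.py -- or per-test with override_settings; a sketch only
CELERY_ALWAYS_EAGER = True                 # run tasks synchronously, in-process
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True  # re-raise task exceptions so tests fail loudly

# example usage in a single test case
from django.test import TestCase, override_settings

@override_settings(CELERY_ALWAYS_EAGER=True,
                   CELERY_EAGER_PROPAGATES_EXCEPTIONS=True)
class UploadViewTest(TestCase):
    def test_view_triggers_task(self):
        self.client.post('/upload/')  # hypothetical view that calls a task with .delay()
        # the task has already run eagerly here, so the database can be asserted on directly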
Just to extend the answer about always-eager mode: you can see my answer on another question about how to run a celery worker from the test setUp: https://stackoverflow.com/a/42107423/590233
But a few things need to be done there (see the sketch after this list):
Connect the celery worker to the test db
Somehow run a message broker instance ... (I think you already run one before the tests, but the cleanest way is to spawn the broker instance from setUp, just like the celery worker)
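A rough sketch of spawning a worker from setUp, using the testing helpers that ship with Celery 4+ rather than the exact code from the linked answer (the app import path is hypothetical):

from celery.contrib.testing.worker import start_worker
from django.test import TransactionTestCase

from myproject.celery import app  # hypothetical: your project's Celery app

class TaskIntegrationTest(TransactionTestCase):
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        # start an in-process worker thread against the app's configured broker
        cls.celery_worker = start_worker(app, perform_ping_check=False)
        cls.celery_worker.__enter__()

    @classmethod
    def tearDownClass(cls):
        cls.celery_worker.__exit__(None, None, None)
        super().tearDownClass()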

celery worker node and celery beat startup configuration

I have a celery worker that executes a bunch of tasks for data loading. I start up my worker node using the following:
celery -A ingest_tasks worker -Q ingest --loglevel=INFO --concurrency=3 -f /var/log/celery/ingest_tasks.log
I have another application I want to set up as a celery beat to periodically grab files from various locations. Since my second application is not under the worker node, I should be OK just starting beat like this, correct?
celery -A file_mover beat -s /my/path/celerybeat-schedule
I have never used celery beat before. From reading the docs it seems pretty straightforward. However, I wanted to make sure this is correct.
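For context, a minimal sketch of the kind of schedule the file_mover app would carry (the task name, broker URL, and interval below are illustrative assumptions, using the pre-4.0 setting name):

# file_mover.py -- sketch only
from datetime import timedelta
from celery import Celery

app = Celery('file_mover', broker='redis://localhost:6379/0')  # assumed broker URL

app.conf.CELERYBEAT_SCHEDULE = {
    'grab-files-every-10-minutes': {
        'task': 'file_mover.grab_files',
        'schedule': timedelta(minutes=10),
    },
}

@app.task
def grab_files():
    ...  # fetch files from the various locations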

Deploy django project with celery

After I deploy my django project, all I need to do is touch the uwsgi_touch file, and uwsgi will gracefully restart its workers. But what about celery? Right now I just restart celery manually whenever the code base of the celery tasks changes. But even if I do it manually, I still can't be sure that I won't kill a running celery task.
Any solutions?
A better way to manage celery workers is to use supervisor
$ pip install supervisor
$ cd /path/to/your/project
$ echo_supervisord_conf > supervisord.conf
Add these to your supervisord.conf file
[program:celeryworker]
command=/path/to/celery worker -A yourapp -l info
stdout_logfile=/path/to/your/logs/celeryd.log
stderr_logfile=/path/to/your/logs/celeryd.log
Now start supervisor with the supervisord command in your terminal and use supervisorctl to manage the process.
To restart you can do
$ supervisorctl restart celeryworker
I've found the answer in the celery FAQ:
http://docs.celeryproject.org/en/2.2/faq.html#how-do-i-shut-down-celeryd-safely
Use the TERM signal, and the worker will finish all currently
executing jobs and shut down as soon as possible. No tasks should be
lost.
You should never stop celeryd with the KILL signal (-9), unless you’ve
tried TERM a few times and waited a few minutes to let it get a chance
to shut down. As if you do tasks may be terminated mid-execution, and
they will not be re-run unless you have the acks_late option set
(Task.acks_late / CELERY_ACKS_LATE).
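Tying that back to the supervisor answer above, the relevant supervisor program options look roughly like this (values are illustrative; supervisor already sends TERM by default, and stopwaitsecs controls how long it waits before escalating to KILL):

[program:celeryworker]
command=/path/to/celery worker -A yourapp -l info
stopsignal=TERM       ; celery's warm shutdown: finish current tasks, accept no new ones
stopwaitsecs=600      ; give long-running tasks time to finish before SIGKILL
stopasgroup=true      ; deliver the signal to the whole process group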

Celery unregistered task KeyError

I start the worker by executing the following in the terminal:
celery -A cel_test worker --loglevel=INFO --concurrency=10 -n worker1.%h
Then I get a long, looping error message stating that celery has received an unregistered task, which triggers:
KeyError: 'cel_test.grp_all_w_codes.mk_dct' #this is the name of the task
The problem with this is that cel_test.grp_all_w_codes.mk_dct doesn't exist. In fact there isn't even a module cel_test.grp_all_w_codes let alone the task mk_dct. There was once a few days ago but I've since deleted it. I thought maybe there was a .pyc file floating around but there isn't. I also can't find a single reference in my code to the task that's throwing the error. I shut down my computer and restarted the rabbitmq server thinking maybe a reference to something was just stuck in memory but it did not help.
Does anyone have any idea what could be the problem here or what I'm missing?
Well, without knowing your conf files, I can see two reasons that would provoke this:
the mk_dct task wasn't completed when you stopped the worker and deleted the module. If you're running with CELERY_ACKS_LATE, it will try to relaunch the task every time you re-run the worker. Try removing this setting, or launch the worker with the --purge option:
celery -A cel_test worker --loglevel=INFO --concurrency=10 -n worker1.%h --purge
the mk_dct task is launched by your celery beat. If so, try relaunching celery beat and clearing its database backend if you have a custom one.
If that does not solve the problem, please post your celery conf, and make sure you have cleaned all the .pyc files of your project and restarted everything.
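If beat is using djcelery's DatabaseScheduler (as in the first question on this page), a quick sketch of clearing the stale entry from the Django shell (for django-celery-beat the model would come from django_celery_beat.models instead):

# python manage.py shell
from djcelery.models import PeriodicTask

# delete any schedule entry still pointing at the removed task
PeriodicTask.objects.filter(task='cel_test.grp_all_w_codes.mk_dct').delete()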

Celery revocations are lost on rabbitMQ restart

We're using celery eta tasks to schedule tasks FAR (like months) in the future.
We're now using the RabbitMQ backend because the mongo backend lost such tasks on a worker restart.
Tasks with the RabbitMQ backend do seem to be persistent across celery and RabbitMQ restarts, BUT revoke messages seem to be lost on RabbitMQ restarts.
I guess that if revoke messages are lost, those eta tasks that should be killed will execute anyway.
This may be helpful from the documentation (Persistent Revokes):
The list of revoked tasks is in-memory so if all workers restart the
list of revoked ids will also vanish. If you want to preserve this
list between restarts you need to specify a file for these to be
stored in by using the --statedb argument to celery worker:
$ celery -A proj worker -l info --statedb=/var/run/celery/worker.state
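For completeness, revoking such a scheduled task is just a control call like the sketch below (the scheduled_task_id variable is hypothetical); with --statedb set, the worker persists the revoked ids across its own restarts, but the revoke message still has to reach a running worker:

from celery import current_app

# mark the previously scheduled eta task as revoked; the worker remembers the id
# (persisted in the statedb file) and discards the task when its eta comes due
current_app.control.revoke(scheduled_task_id)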
