We're using Celery ETA tasks to schedule tasks far (months) in the future.
We're now using the RabbitMQ backend because the Mongo backend lost such tasks on a worker restart.
Tasks with the RabbitMQ backend do seem to persist across Celery and RabbitMQ restarts, BUT revoke messages seem to be lost on RabbitMQ restarts.
I assume that if revoke messages are lost, the ETA tasks that should have been revoked will execute anyway.
This may be helpful from the documentation (Persistent Revokes):
The list of revoked tasks is in-memory so if all workers restart the list of revoked ids will also vanish. If you want to preserve this list between restarts you need to specify a file for these to be stored in by using the --statedb argument to celery worker:
$ celery -A proj worker -l info --statedb=/var/run/celery/worker.state
Related
A Celery worker created from one file of tasks "remembers" tasks from a previous Celery worker, even though they are not in the file of tasks I am now using.
Some time ago, I created a Celery worker using a file of tasks called 'tasks.py':
celery -A tasks worker --loglevel=info
All was well. Now, having moved on, I am attempting to create a Celery worker from a different file of tasks called 'dwtasks.py':
celery -A dwtasks worker --loglevel=info
The splash screen shown when the new worker starts lists the tasks defined in 'dwtasks.py' as well as all the tasks that were defined in 'tasks.py'. The worker also fails to start if 'tasks.py' is not available. If I make no reference to 'tasks.py', how can all future Celery workers know about those tasks?
Celery is v4.3.0 (rhubarb)
OS is Ubuntu 16.04
You have most likely misconfigured Celery. Check your configuration and see whether you use auto-discovery, and where your Celery imports point...
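For illustration, a minimal sketch (module and broker names are assumed, not taken from your project) of how a stale imports entry or Django-style autodiscovery makes a worker started with -A dwtasks still register, and depend on, the old tasks.py:

# celery_app.py - hypothetical app module
from celery import Celery

app = Celery('dwtasks', broker='redis://localhost:6379/0')

# A leftover 'tasks' entry here (or in CELERY_IMPORTS) means every worker
# started from this app also registers tasks.py and fails when it is missing.
app.conf.imports = ('dwtasks', 'tasks')

# In a Django project, autodiscovery has a similar effect: it registers the
# tasks.py of every installed app, regardless of which -A module you pass.
# app.autodiscover_tasks()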
I have Celery running on a few computers and use Flower for monitoring.
The computers are used by different people.
Celery beat generates jobs for all the workers from one of the computers.
Every time a newly coded task is rolled out, all the workers except the beat computer raise a "task not registered" exception.
What is the recommended way to sync the code to all the other computers in the network? Is there a pre-hook mechanism in Celery to check for new code?
Unfortunately, you need to update the code on all the workers (nodes) and then restart them all. This is by (good) design.
A clever systemd service could, in theory (see the sketch after this list):
send the graceful shutdown signal
run pip install -U your-project
start the Celery service
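A rough, purely illustrative sketch of such a unit file (paths, project and service names are assumptions):

# /etc/systemd/system/celery-worker.service (hypothetical)
[Unit]
Description=Celery worker that upgrades the project code on every (re)start
After=network.target

[Service]
WorkingDirectory=/srv/your-project
# upgrade the code before the worker starts
ExecStartPre=/srv/your-project/venv/bin/pip install -U your-project
ExecStart=/srv/your-project/venv/bin/celery -A proj worker -l info
# systemd stops the service with SIGTERM by default, which triggers
# Celery's warm (graceful) shutdown; give long-running tasks time to finish
TimeoutStopSec=300
Restart=on-failure

[Install]
WantedBy=multi-user.target

A redeploy on each node is then just: systemctl restart celery-worker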
I would like to run APScheduler as part of a WSGI webapp (via Apache's mod_wsgi with 3 workers). I am new to the WSGI world, so I would appreciate it if you could resolve my doubts:
If APScheduler is part of the webapp, does it only come alive after the first request (the first one after starting/restarting Apache) handled by at least one worker? Starting/restarting Apache alone won't start it; at least one request is needed.
What about concurrent requests: would every worker run its own set of APScheduler tasks, or would there be only one set shared between all workers?
Would a process that is already running (the webapp run via a worker) stay alive, so APScheduler's tasks keep executing, or could it be terminated after some idle time, in which case APScheduler's tasks would stop executing?
Thank you!
You're right -- the scheduler won't start until the first request comes in.
Therefore running a scheduler in a WSGI worker is not a good idea. A better idea would be to run the scheduler in a separate process and connect to the scheduler when necessary via some RPC mechanism like RPyC or Execnet.
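A rough sketch of that approach with RPyC, loosely following the pattern from APScheduler's own RPC example (module names, the port, and the job callable are assumptions):

# scheduler_service.py - runs as its own long-lived process, not under WSGI
import rpyc
from rpyc.utils.server import ThreadedServer
from apscheduler.schedulers.background import BackgroundScheduler

scheduler = BackgroundScheduler()

class SchedulerService(rpyc.Service):
    # jobs are added by textual reference ('module:function'), so the
    # scheduler process never needs the client's callables
    def exposed_add_job(self, func, *args, **kwargs):
        return scheduler.add_job(func, *args, **kwargs)

    def exposed_remove_job(self, job_id):
        scheduler.remove_job(job_id)

if __name__ == '__main__':
    scheduler.start()
    ThreadedServer(SchedulerService, port=12345,
                   protocol_config={'allow_public_attrs': True}).start()

# In the WSGI app, connect only when you need to (re)schedule something:
# conn = rpyc.connect('localhost', 12345)
# conn.root.add_job('myjobs:send_report', 'interval', minutes=30)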
I have unit tests for my Django project.
Some views in my Django project run Celery tasks, and I want to check the database after these tasks have run.
I have separate tests for the Celery tasks themselves, where I call them without the .delay() method.
The main problem: what is the best and cleanest way to have a Celery worker available during the Jenkins job?
Currently I just run nohup celery -A myqpp worker & before the tests and kill all running Celery processes at the end of the job.
The best and cleanest way is not to have any Celery workers during the Jenkins job, nor any queue/result backend. Use the CELERY_ALWAYS_EAGER setting to execute your tasks locally in unit tests, blocking until each task returns.
Read more in the Celery documentation: CELERY_ALWAYS_EAGER docs
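For reference, a minimal sketch of the relevant test settings (old-style names, matching CELERY_ALWAYS_EAGER above; on Celery 4+ the lowercase task_always_eager form is the equivalent):

# settings used only for tests
CELERY_ALWAYS_EAGER = True                  # .delay()/.apply_async() run inline, no worker or broker
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True   # task exceptions surface in the test instead of being swallowed

# or per test case in Django:
# from django.test import override_settings
#
# @override_settings(CELERY_ALWAYS_EAGER=True)
# class MyViewTests(TestCase):
#     ...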
Just to extend the answer about always-eager mode: you can see my answer to another question about how to run a Celery worker from the test setUp: https://stackoverflow.com/a/42107423/590233
But a few things need to be done there:
Connect the Celery worker to the test DB
Somehow run a message broker instance (I think you already run one before the tests, but the cleanest way is to spawn the broker instance from setUp, just like the Celery worker); a rough sketch follows below
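If you do want a live worker inside the test process, a rough sketch (assuming Celery 4+, a reachable broker and result backend, and that the app and task are importable from the hypothetical paths shown) using celery.contrib.testing.worker.start_worker:

# tests/test_with_worker.py - hypothetical sketch
from django.test import TransactionTestCase   # commits to the test DB so the worker thread can see the data
from celery.contrib.testing.worker import start_worker

from myproject.celery import app   # assumed location of your Celery app


class TasksAgainstDbTest(TransactionTestCase):
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        # start an in-process worker thread; stopped again in tearDownClass
        cls.celery_worker = start_worker(app, perform_ping_check=False)
        cls.celery_worker.__enter__()

    @classmethod
    def tearDownClass(cls):
        cls.celery_worker.__exit__(None, None, None)
        super().tearDownClass()

    def test_task_writes_to_db(self):
        from myapp.tasks import my_task   # hypothetical task
        my_task.delay().get(timeout=10)   # .get() needs a result backend
        # ...assert on the database here...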
I'm using periodic celery tasks with Django. I used to have the following task in my app/tasks.py file:
@periodic_task(run_every=timedelta(minutes=2))
def stuff():
    ...
But this task has now been removed from my app/tasks.py file. However, I keep seeing calls to this task in my Celery logs:
[2013-05-21 07:08:37,963: ERROR/MainProcess] Received unregistered task of type u'app.tasks.stuff'.
It seems that the celery beat scheduler that I use does not update its queue. This is how the scheduler is defined in my project/settings.py file:
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
Restarting the celery worker does not help. FYI, I use a Redis broker.
How can I either clear or update the celery beat queue so that older tasks are not sent to my celery worker?
Install django-celery.
As cited, this project is not needed to use Celery, but you do need it to enable the admin interface at /admin/djcelery/ for managing periodic tasks. Initially there won't be any registered or periodic tasks.
Restart beat and check the Periodic tasks table again. Beat will have added the existing scheduled tasks to that table, with the interval or crontab defined in the settings or the decorators. There you can delete the unwanted tasks.
UPDATE: From Celery 4, it's recommended to use this package instead: https://github.com/celery/django-celery-beat
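If you prefer doing it from a shell rather than the admin, a small sketch (the task path matches the log line above; with django-celery-beat the import would be django_celery_beat.models instead):

# python manage.py shell
from djcelery.models import PeriodicTask

# remove the stale schedule entry so beat stops dispatching app.tasks.stuff
PeriodicTask.objects.filter(task='app.tasks.stuff').delete()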
Delete the .pyc file for the module where the task was originally written, or just delete all .pyc files in your project's directory.
This command should work:
find . -name "*.pyc" -exec rm -rf {} \;
How do I remove all .pyc files from a project?