I am using Celery 3.1.16 with a RabbitMQ broker and multiple Celery workers, with celeryd daemonized through supervisor. The problem is with task updates: when I update my tasks.py file, the Celery workers keep running the old code.
Celery launch command:
/home/my_project/bin/celery -B --autoreload --app=my_app.celery:app worker --loglevel=INFO
I include the tasks file in my Django settings.py:
CELERY_IMPORTS = [
    'my_app.tasks'
]
pyinotify is installed and appears to work; here is part of the Celery log:
[2014-12-16 20:56:00,016: INFO/MainProcess] Task my_app.tasks.periodic_update_task_statistic[175c2557-7c07-43c3-ac70-f4e115344134] succeeded in 0.00816309102811s: 'ok!'
[2014-12-16 20:56:11,157: INFO/MainProcess] Detected modified modules: ['my_app.tasks']
[2014-12-16 20:57:00,001: INFO/Beat] Scheduler: Sending due task my_app.tasks.periodic_update_task_statistic (my_app.tasks.periodic_update_task_statistic)
[2014-12-16 20:57:00,007: INFO/MainProcess] Received task: my_app.tasks.periodic_update_task_statistic[f22998a9-dcb4-4c29-8086-86dd6e57eae1]
So, my question: how do I get Celery to pick up and apply new task code when it has been modified?
Celery has an open issue about this problem:
https://github.com/celery/celery/issues/1025
I have this same problem. While I don't like it, I do the following, which first removes any compiled .pyc files anywhere under my current directory, then restarts all the workers.
find . -name "*.pyc" -exec rm {} \;
supervisorctl restart all
It seems strange that the --autoreload flag does nothing, but that is the case for me.
Celery only autoreloads the modules it loaded directly; it does not keep an eye on other modules that those modules load in turn.
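If the code you keep editing lives in a module that tasks.py merely imports, one workaround is to list that module in CELERY_IMPORTS as well, so Celery (and therefore the autoreloader) imports it directly. A minimal sketch for settings.py; 'my_app.helpers' is a hypothetical name standing in for whatever module your tasks actually import:

CELERY_IMPORTS = [
    'my_app.tasks',
    'my_app.helpers',  # hypothetical: a module imported by tasks.py that you also edit
]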
Problem statement: celery beat is sending the scheduled task on time, but the worker is not receiving and executing it.
I am using the following Celery versions:
django-celery-beat==2.2.0
celery==4.4.0
django-celery==3.3.0
The command used for celery beat:
celery -A project_path.dev beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
The command used for the celery worker:
celery worker -A project_path.dev --pool=solo -Q celery -l info
task.py
@periodic_task(run_every=(crontab(minute='*/30')),
               options={'queue': settings.CELERY_QUEUES_DICT["celery-periodic"]})
def celery_task():
    print("Executing Task")
celery-beat logs:
[2022-07-03 23:00:00,501: INFO/MainProcess] Scheduler: Sending due task path.to.celery_task (path.to.celery_task)
celery-dev logs:
[tasks]
. path.to.celery_task
I see that a couple of other tasks are not getting executed either. Can I get some help here to understand the issue?
Your worker command uses -Q celery, which tells the worker to process only tasks from the celery queue. But the queue in your task definition is settings.CELERY_QUEUES_DICT["celery-periodic"]. You should check whether they point to the same queue.
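As a quick sanity check, you could print what that settings entry resolves to; a worker started with -Q celery only consumes the queue literally named celery. A minimal sketch, assuming CELERY_QUEUES_DICT maps names to plain queue-name strings (its real structure in your project may differ):

from django.conf import settings

task_queue = settings.CELERY_QUEUES_DICT["celery-periodic"]
# Must be "celery" for a worker started with "-Q celery" to pick this task up.
print(task_queue)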
Found the issue. The queue in question had around 120k pending messages waiting to be processed, hence the delay.
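For anyone who hits something similar: one way to spot such a backlog from Python is to declare the queue passively and read its message count. A rough sketch using kombu; the broker URL and queue name below are assumptions, substitute your own:

from kombu import Connection

with Connection('amqp://guest:guest@localhost//') as conn:
    channel = conn.channel()
    # passive=True only inspects the queue; it does not create or modify it.
    info = channel.queue_declare(queue='celery', passive=True)
    print(info.message_count)  # messages still waiting to be processed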
I actually had issues printing mine in the command prompt because I was using the wrong command, but I found a link to a project which I forked (Project).
(If on Mac ) celery -A Project worker --loglevel=info
(If on Windows) celery -A Project worker -l info --pool=solo
Run
celery -A Project worker --loglevel=info
in the project folder where your Celery object is created.
You can use the file name to have your tasks discovered.
For example, if your file is tasks.py and the Celery object is created in that file, the command will be:
celery -A tasks worker --loglevel=info
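For reference, a minimal tasks.py matching that command might look like the sketch below; the broker URL is an assumption, adjust it to your setup:

from celery import Celery

# The app lives in tasks.py, so "-A tasks" finds it by module name.
app = Celery('tasks', broker='amqp://guest:guest@localhost//')

@app.task
def add(x, y):
    return x + y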
I have a Django 1.11.4 app and use Celery 4.1.0 for background periodic tasks. Celery was daemonized according to the documentation and was working fine until... I don't know what happened, basically. It suddenly broke.
When I execute /etc/init.d/celerybeat start, it writes the following exception to /var/log/celery/beat.log and halts:
[2017-09-04 18:33:38,485: INFO/MainProcess] beat: Starting...
[2017-09-04 18:33:38,485: INFO/MainProcess] Writing entries...
[2017-09-04 18:33:38,486: CRITICAL/MainProcess] beat raised exception <class 'django.db.utils.InterfaceError'>: InterfaceError
("(0, '')",)
Traceback (most recent call last):
File "/home/hikesadmin/.local/lib/python3.4/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'scheduler'
Here is the full log: https://pastebin.com/92iraMCL
I've removed all my tasks and kept a simple celery.py file:
import os
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myapp.settings')

import django
import pymysql

pymysql.install_as_MySQLdb()
django.setup()

from celery import Celery

app = Celery('myapp')
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

@app.task
def gogo():
    print("GOGO")

print(123123123)
But celerybeat still does not work. It prints "123123123" and after that halts with the same exception.
I've dug deeper and figured out that the problem is the --detach flag. When I launch without it, it works:
/usr/local/bin/celery beat --app=hike_engine -S django -f /var/log/celery/beat.g -l INFO --workdir=/home/hikesadmin/engine --pidfile=/var/run/celery/beat.pid
When I add --detach, Celery breaks.
Please help me trace and fix the problem. Thanks!
I don't think you can start a detached beat instance via --detach, IIRC. We are using supervisor to start/stop/restart our beat instance:
Add /etc/supervisor/conf.d/celery-beat.conf
[program:celery-beat]
command=/path/to/env/bin/celery beat -A your.settings.module --loglevel=INFO --workdir /path/to/your/django/project --pidfile=/var/tmp/celery-beat.pid
directory=/path/to/your/django/project
killasgroup=true
stopasgroup=true
user=YOURUSERID
group=YOURUSERGROUP
numprocs=1
stdout_logfile=/var/log/celery/beat.log
stderr_logfile=/var/log/celery/beat.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs = 600
priority=998
then update the configuration
supervisorctl reread
and finally start the supervised celery beat instance
supervisorctl start celery-beat
An init.d script is also documented in the Celery docs:
http://docs.celeryproject.org/en/latest/userguide/daemonizing.html#init-script-celerybeat
The scenario:
Two unrelated web apps with Celery background tasks running on the same server.
One RabbitMQ instance
Each web app has its own virtualenv (including celery). Same celery version in both virtualenvs.
I use the following command lines to start a worker and a beat process for each application.
celery -A firstapp.tasks worker
celery -A firstapp.tasks beat
celery -A secondapp.tasks worker --hostname foobar
celery -A secondapp.tasks beat
Now everything seems to work OK, but in the worker process of secondapp I get the following error:
Received unregistered task of type 'firstapp.tasks.do_something'
Is there a way to isolate the two Celery apps from each other?
I'm using Celery version 3.1.16, BTW.
I believe I fixed the problem by creating a RabbitMQ vhost and configuring the second app to use that one.
Create vhost (and set permissions):
sudo rabbitmqctl add_vhost /secondapp
sudo rabbitmqctl set_permissions -p /secondapp guest ".*" ".*" ".*"
And then change the command lines for the second app:
celery -A secondapp.tasks -b amqp://localhost//secondapp worker
celery -A secondapp.tasks -b amqp://localhost//secondapp beat
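Alternatively, you could bake the vhost into each app's own configuration instead of passing -b on every command line. A minimal sketch for the second app's Celery module; the module name and guest credentials are assumptions:

from celery import Celery

# Point this app at its own vhost so its queues never collide with firstapp's.
app = Celery('secondapp.tasks', broker='amqp://guest:guest@localhost//secondapp')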
I have automated tasks working locally but not remotely in my Django app. I was watching a tutorial and the guy said to stop my worker, but before I did that I put my app in maintenance mode; that didn't work. Then I ran
heroku ps:restart
that didn't work, then I ran
heroku ps:stop worker
which output:
Warning: The dynos in your dyno formation will be automatically restarted.
then I ran
heroku ps:scale worker=1
and still nothing. I remind those who are reading this that it worked locally. What am I missing?
my procfile
web: gunicorn gettingstarted.wsgi --log-file -
worker: celery worker -A blog -l info
While researching I'm seeing mentions of adding beat to the Procfile (two mentions, in fact), but this was not discussed in the tutorial I watched. The only time celery beat was mentioned is when I added this to the settings.py file:
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
And just in case it makes a difference, I'm using the djcelery GUI to set periodic tasks, not configuring the scheduler in settings.py like I see in the majority of examples.
If I run the task in my view and call it, it works. But it won't run if I set it up using djcelery.
I read the docs and realized I had to add -B to my worker line in the Procfile, so it now looks like this:
celery -A proj worker -B -l info
after I made the change I did this
heroku ps:scale worker=0
then
git add .
git commit -am 'added -B'
git push heroku master
then I
heroku ps:scale worker=1
then, so I could see the output from Heroku:
heroku logs -t -p worker
and created a schedule in my admin, and it worked; I saw the output in the console. Hope this helps. NOTE: the docs say -B is not recommended for production. I'm not sure why, but if you know or find out, let me know.
As you've read in the docs, using the -B option is not recommended for production use; you'd better run celery beat as a separate process.
So the best practice is to run it on the server like:
celery beat -A messaging_router --loglevel=INFO
And if you're using supervisor to keep your processes running, you'd add something like the following to your configuration file.
[program:api_beat]
command=/path/to/v_envs/v_env/bin/celery -A project beat --loglevel=info
autostart=true
autorestart=true
user=your_user (echo $USER)
directory=/path/to/project/root
stdout_logfile=/var/log/supervisor/beat.log
stderr_logfile=/var/log/supervisor/beat.log
redirect_stderr=true
environment=ENV_VAR1="whatever"
The reason for this is, as the docs say:
You can also embed beat inside the worker by enabling the worker's -B option; this is convenient if you will never run more than one worker node, but it's not commonly used and for that reason is not recommended for production use.
Consider having more than one worker: during maintenance you always need to be aware of which of the Celery workers you ran with -B, and that can definitely become a burden.