I have a Django 1.11.4 app and use Celery 4.1.0 for background periodic tasks. Celery was daemonized according to the documentation and was working fine until... I don't know what happened, basically; it suddenly broke.
When I execute /etc/init.d/celerybeat start, it writes the following exception to /var/log/celery/beat.log and halts:
[2017-09-04 18:33:38,485: INFO/MainProcess] beat: Starting...
[2017-09-04 18:33:38,485: INFO/MainProcess] Writing entries...
[2017-09-04 18:33:38,486: CRITICAL/MainProcess] beat raised exception <class 'django.db.utils.InterfaceError'>: InterfaceError
("(0, '')",)
Traceback (most recent call last):
File "/home/hikesadmin/.local/lib/python3.4/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'scheduler'
Here is the full log: https://pastebin.com/92iraMCL
I've removed all my tasks and kept only a simple celery.py file:
import os
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myapp.settings')

import django
import pymysql
pymysql.install_as_MySQLdb()
django.setup()

from celery import Celery

app = Celery('myapp')
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

@app.task
def gogo():
    print("GOGO")

print(123123123)
But celery beat still does not work. It prints "123123123" and then halts with the same exception.
I've dug deeper and figured out that the problem is the --detach flag. When I launch celery beat without it, it works:
/usr/local/bin/celery beat --app=hike_engine -S django -f /var/log/celery/beat.g -l INFO --workdir=/home/hikesadmin/engine --pidfile=/var/run/celery/beat.pid
When I add --detach, celery breaks.
Please help me to trace and fix the problem. Thanks!
IIRC, I don't think you can start a detached beat instance via --detach. We use supervisor to start/stop/restart our beat instance:
Add /etc/supervisor/conf.d/celery-beat.conf:
[program:celery-beat]
command=/path/to/env/bin/celery beat -A your.settings.module --loglevel=INFO --workdir /path/to/your/django/project --pidfile=/var/tmp/celery-beat.pid
directory=/path/to/your/django/project
killasgroup=true
stopasgroup=true
user=YOURUSERID
group=YOURUSERGROUP
numprocs=1
stdout_logfile=/var/log/celery/beat.log
stderr_logfile=/var/log/celery/beat.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs=600
priority=998
Then update the configuration:
supervisorctl reread
And finally start the supervised celery beat instance:
supervisorctl start celery-beat
The init.d approach is also documented in the Celery docs:
http://docs.celeryproject.org/en/latest/userguide/daemonizing.html#init-script-celerybeat
Related
Problem statement: celery beat is sending the scheduled task on time, but the worker is not receiving and executing it.
I am using the following versions:
django-celery-beat==2.2.0
celery==4.4.0
django-celery==3.3.0
The command being used for celery beat:
celery -A project_path.dev beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
The command being used for the celery worker:
celery worker -A project_path.dev --pool=solo -Q celery -l info
task.py
from celery.schedules import crontab
from celery.task import periodic_task
from django.conf import settings

@periodic_task(run_every=crontab(minute='*/30'),
               options={'queue': settings.CELERY_QUEUES_DICT["celery-periodic"]})
def celery_task():
    print("Executing Task")
celery-beat logs:
[2022-07-03 23:00:00,501: INFO/MainProcess] Scheduler: Sending due task path.to.celery_task (path.to.celery_task)
celery-dev logs:
[tasks]
. path.to.celery_task
I also see a couple of other tasks that are not getting executed. Can I get some help understanding the issue?
Your worker command's -Q celery option means the worker only processes tasks from the celery queue, but the queue in your task definition is settings.CELERY_QUEUES_DICT["celery-periodic"]. You should check whether they point to the same queue.
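One quick way to check is to compare the queue name in settings with the queues each running worker actually consumes. A minimal sketch, assuming the Celery app object is importable from project_path.dev (per the -A value above) and that this is run from a Django shell:
# Sketch only: compare the configured periodic queue with the queues the
# running workers consume (the app import path is an assumption).
from django.conf import settings
from project_path.dev import app

print(settings.CELERY_QUEUES_DICT["celery-periodic"])
for worker, queues in (app.control.inspect().active_queues() or {}).items():
    print(worker, [q['name'] for q in queues])
If the names differ, either point the task's queue option at celery or start the worker consuming both queues (e.g. -Q celery,celery-periodic).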
Found the issue: the queue in question had around 120k pending messages waiting to be processed, hence the problem.
I have a Django application where I defined a few @task functions in tasks.py to execute at given periodic intervals. I'm 100% sure that the issue is not caused by tasks.py or any related code, but by some configuration, maybe in settings.py or my celery worker.
The task does execute periodically, but it runs multiple times.
Here are the celery worker logs:
celery -A cimexmonitor worker --loglevel=info -B -c 4
[2019-09-19 21:22:16,360: INFO/ForkPoolWorker-5] Project Monitor Started : APPProject1
[2019-09-19 21:22:16,361: INFO/ForkPoolWorker-4] Project Monitor Started : APPProject1
[2019-09-19 21:25:22,108: INFO/ForkPoolWorker-4] Project Monitor DONE : APPProject1
[2019-09-19 21:25:45,255: INFO/ForkPoolWorker-5] Project Monitor DONE : APPProject1
[2019-09-20 00:22:16,395: INFO/ForkPoolWorker-4] Project Monitor Started : APPProject2
[2019-09-20 00:22:16,398: INFO/ForkPoolWorker-5] Project Monitor Started : APPProject2
[2019-09-20 01:22:11,554: INFO/ForkPoolWorker-5] Project Monitor DONE : APPProject2
[2019-09-20 01:22:12,047: INFO/ForkPoolWorker-4] Project Monitor DONE : APPProject2
If you check the time intervals above, tasks.py defines one task, but two celery workers pick it up and execute the same task at the same interval. I'm not sure why two workers took the one task.
settings.py
..
..
# Internationalization
# https://docs.djangoproject.com/en/2.1/topics/i18n/
LANGUAGE_CODE = 'en-us'
TIME_ZONE = 'Asia/Kolkata'
USE_I18N = True
USE_L10N = True
USE_TZ = True
..
..
..
######## CELERY : CONFIG
CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ENABLE_UTC = True
CELERYBEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
celery.py
from __future__ import absolute_import, unicode_literals
from celery import Celery
import os
from django.conf import settings

# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'cimexmonitor.settings')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app = Celery('cimexmonitor')
# app.config_from_object('django.conf:settings', namespace='CELERY')
app.config_from_object('django.conf:settings')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks(settings.INSTALLED_APPS)

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
Other information:
→ celery --version
4.3.0 (rhubarb)
→ redis-server --version
Redis server v=3.0.6 sha=00000000:0 malloc=jemalloc-3.6.0 bits=64 build=7785291a3d2152db
django-admin-interface==0.9.2
django-celery-beat==1.5.0
Please help me with ways to debug the problem.
Thanks
Both the worker and beat services need to be running at the same time to execute periodic tasks, as per https://github.com/celery/django-celery-beat
WORKER:
$ celery -A [project-name] worker --loglevel=info -B -c 5
Django scheduler:
celery -A [project-name] beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
I was running both the worker and the database scheduler at the same time, as the documentation says, and that was causing the tasks to be executed twice at the same time. I'm really not sure how the celery worker started working as a DB scheduler at the same time.
Just running the celery worker solved my problem.
From the official documentation: Ensuring a task is only executed one at a time.
Also, I hope you are not running multiple workers the same way (celery -A cimexmonitor worker --loglevel=info -B -c 4), as that would mean you have multiple celery beat instances scheduling tasks to run... In short: make sure you only have one celery beat running!
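For reference, here is a minimal sketch of the cache-lock pattern that documentation section describes, assuming Django's cache framework is configured; the task name, lock key and timeout below are illustrative only:
from celery import shared_task
from django.core.cache import cache

LOCK_EXPIRE = 60 * 10  # should comfortably exceed the task's worst-case runtime

@shared_task(bind=True)
def project_monitor(self, project_name):
    lock_id = 'project-monitor-lock-%s' % project_name
    # cache.add() is atomic: it only succeeds if the key does not already exist,
    # so only one worker at a time runs the body for a given project.
    if cache.add(lock_id, 'locked', LOCK_EXPIRE):
        try:
            print('Project Monitor Started : %s' % project_name)
            # ... actual monitoring work goes here ...
        finally:
            cache.delete(lock_id)
    else:
        print('Project Monitor already running, skipping : %s' % project_name)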
File structure:
proj/proj/
    celery.py
    (and other files)
/sitesettings/
    tasks.py
    (and other files)
celery.py
from celery import Celery

app = Celery('mooncake', broker='amqp://')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
sitesettings/tasks.py
from __future__ import absolute_import, unicode_literals

from comma.models import Post
from mooncake.celery import app

app.conf.beat_schedule = {
    'every-5-seconds': {
        'task': 'sitesettings.tasks.statisticsTag',
        'schedule': 5.0,
        'args': ()
    },
}

@app.task
def statisticsTag():
    print(Post.objects.all()[0])
And I run it with:
celery -A proj beat -l info
It outputs:
[2019-02-22 18:21:08,346: INFO/MainProcess] Scheduler: Sending due task every-5-seconds (sitesettings.tasks.statisticsTag)
but there is no further output.
I tried writing it in proj/celery.py, but it couldn't run because I have to import from another app there, and it exited with an "app not loaded" error. So what should I do?
The command you are calling, celery -A proj beat -l info, starts a beat scheduler instance of celery, which sends due tasks to a worker instance.
You will also need to start a worker that will execute those due tasks. You can start a celery worker with the command celery -A proj worker -l info. This will need to be running at the same time as your scheduler.
Alternatively, you can run a worker with an embedded beat scheduler (celery -A proj worker -B -l info), but that is not recommended for production use.
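As a side note on the "app not loaded" error mentioned in the question: one common pattern (a sketch only, assuming the project layout above) is to declare the schedule in proj/celery.py and refer to the task by its name string, so celery.py never has to import sitesettings.tasks or any models:
# In proj/celery.py, after `app` is created. Beat looks the task up by name
# at send time, so no import of sitesettings.tasks is needed here.
app.conf.beat_schedule = {
    'every-5-seconds': {
        'task': 'sitesettings.tasks.statisticsTag',
        'schedule': 5.0,
    },
}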
The scenario:
Two unrelated web apps with celery background tasks running on the same server.
One RabbitMQ instance
Each web app has its own virtualenv (including celery). Same celery version in both virtualenvs.
I use the following command lines to start a worker and a beat process for each application.
celery -A firstapp.tasks worker
celery -A firstapp.tasks beat
celery -A secondapp.tasks worker --hostname foobar
celery -A secondapp.tasks beat
Now everything seems to work OK, but in the worker process of secondapp I get the following error:
Received unregistered task of type 'firstapp.tasks.do_something'
Is there a way to isolate the two celery instances from each other?
I'm using Celery version 3.1.16, BTW.
I believe I fixed the problem by creating a RabbitMQ vhost and configuring the second app to use that one.
Create vhost (and set permissions):
sudo rabbitmqctl add_vhost /secondapp
sudo rabbitmqctl set_permissions -p /secondapp guest ".*" ".*" ".*"
And then change the command lines for the second app:
celery -A secondapp.tasks -b amqp://localhost//secondapp worker
celery -A secondapp.tasks -b amqp://localhost//secondapp beat
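Instead of repeating -b on every command line, the broker (including its vhost) can also be pinned in each app's own Celery configuration. A sketch, assuming the default guest credentials used above:
# In secondapp's Celery module (sketch): the trailing "//secondapp" selects
# the "/secondapp" vhost created with rabbitmqctl above.
from celery import Celery

app = Celery('secondapp', broker='amqp://guest:guest@localhost:5672//secondapp')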
I am using Celery 3.1.16 with a RabbitMQ broker and multiple Celery workers (celeryd) daemonized through supervisor. The problem is with task updates: when I update my tasks.py file, the celery worker still runs the old code.
Celery launch command:
/home/my_project/bin/celery -B --autoreload --app=my_app.celery:app worker --loglevel=INFO
I include the tasks file in Django settings.py:
CELERY_IMPORTS = [
    'my_app.tasks',
]
pyinotify is installed and works (I guess so); here is part of the celery logs:
[2014-12-16 20:56:00,016: INFO/MainProcess] Task my_app.tasks.periodic_update_task_statistic[175c2557-7c07-43c3-ac70-f4e115344134] succeeded in 0.00816309102811s: 'ok!'
[2014-12-16 20:56:11,157: INFO/MainProcess] Detected modified modules: ['my_app.tasks']
[2014-12-16 20:57:00,001: INFO/Beat] Scheduler: Sending due task my_app.tasks.periodic_update_task_statistic (my_app.tasks.periodic_update_task_statistic)
[2014-12-16 20:57:00,007: INFO/MainProcess] Received task: my_app.tasks.periodic_update_task_statistic[f22998a9-dcb4-4c29-8086-86dd6e57eae1]
So, my question: how do I get celery to pick up and apply new task code when it has been modified?
Celery has an open issue about this problem:
https://github.com/celery/celery/issues/1025
I have this same problem. While I don't like it, I do the following, which first removes any compiled .pyc files under my current directory, then restarts all the workers.
find . -name "*.pyc" -exec rm {} \;
supervisorctl restart all
It seems strange that the --autoreload flag does nothing, but it doesn't in my case.
Celery only autoreloads the modules that it loaded directly; it does not keep an eye on other modules imported by those direct modules.
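If restarting everything through supervisor feels heavy-handed, Celery 3.1 also documents a pool_restart remote control command that can reload specific modules. A sketch, assuming CELERYD_POOL_RESTARTS = True is set in the worker configuration and the app import path from the question's --app value:
# Ask all running workers to restart their pools and reload the changed module
# (requires CELERYD_POOL_RESTARTS = True on the workers).
from my_app.celery import app

app.control.broadcast('pool_restart',
                      arguments={'modules': ['my_app.tasks'], 'reload': True})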