Stack:
Python v2.7
Django v1.11
Celery v4.3.0
Gunicorn v19.7.1
Nginx v1.10
When I run the Django server and Celery manually, the async tasks execute as expected.
The problem appears when I deploy the Django project with Gunicorn and Nginx.
I tried running Celery under supervisor, but it didn't help.
views.py
def _functionA():
    _functionB.delay()  # where _functionB is a registered async task
settings.py
# Celery settings
CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
celery_init.py
from __future__ import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'cpi_server.settings')
app = Celery('myproject')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
__init__.py
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from myproject.celery_init import app as celery_app
__all__ = ['celery_app']
gunicorn.service
[Unit]
Description=Gunicorn application server....
After=network.target
[Service]
User=root
Group=www-data
WorkingDirectory=<myprojectdir>
Environment=PYTHONPATH=<ENV>
ExecStart=<myprojectdir>/env/bin/gunicorn --workers 3 --access-logfile access_gunicorn.log --error-logfile error_gunicorn.log --capture-output --log-level debug --bind unix:<myprojectdir>/myproject.sock <myproject>.wsgi:application
[Install]
WantedBy=multi-user.target
myproject_nginx.conf
server {
listen 8001;
location / {
include proxy_params;
proxy_pass http://unix:<myprojectdir>/myproject.sock;
}
}
celery worker
celery worker -B -l info -A myproject -Q celery,queue1,queue2,queue3 -n beat.%h -c 1
Can anyone help me with my question below:
Why is it that, when Django is deployed using Gunicorn and Nginx, the Celery worker doesn't execute tasks, whereas when run manually (i.e. with python manage.py runserver ...) it does?
You have a concurrency level of 1 (the -c 1 in your worker command line). This means the worker is configured to run a single task at any point in time. If your tasks are long-running, you may be under the impression that Celery is not running anything...
You can easily test this - when you start some task, run the following:
celery -A myproject inspect active
That will list the running tasks (if any).
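Another thing to check is your configuration variables. Celery 4 introduced lower-case setting names; when loading from Django settings with namespace='CELERY', each setting is the upper-cased new name with a CELERY_ prefix, and old and new names should not be mixed. Read the What’s new in Celery 4.0 (latentcall) document for more information, especially the Lowercase setting names section.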
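As a rough illustration of how the CELERY_-prefixed Django settings from the question relate to Celery 4's lower-case names, here is a minimal sketch. It only imitates the naming rule (strip the prefix, lower-case the rest) and is not Celery's actual implementation; `to_celery_setting` is a hypothetical helper.

```python
def to_celery_setting(django_name, namespace="CELERY"):
    """Approximate the lower-case Celery 4 name for a namespaced Django setting.

    Hypothetical helper for illustration only; returns None when the
    name does not carry the namespace prefix.
    """
    prefix = namespace + "_"
    if not django_name.startswith(prefix):
        return None
    return django_name[len(prefix):].lower()


# Setting names copied from the settings.py shown in the question.
settings_from_question = [
    "CELERY_BROKER_URL",
    "CELERY_RESULT_BACKEND",
    "CELERY_ACCEPT_CONTENT",
    "CELERY_RESULT_SERIALIZER",
    "CELERY_TASK_SERIALIZER",
]
mapping = {name: to_celery_setting(name) for name in settings_from_question}
for django_name in settings_from_question:
    print(django_name, "->", mapping[django_name])
```

A name without the prefix (for example an old-style BROKER_URL) returns None here, which is a quick way to spot settings that the namespace would not pick up.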
I have a FastAPI application deployed on DigitalOcean, it has multiple API endpoints and in some of them, I have to run a scraping function as a background job using the RQ package in order not to keep the user waiting for a server response.
I've already managed to create a Redis database on DigitalOcean and successfully connect the application to it, but I'm facing issues with running the RQ worker.
Here's the code, based on RQ's official documentation:
import os
import redis
from rq import Worker, Queue, Connection
listen = ['high', 'default', 'low']
# Connect to DigitalOcean's Redis database
REDIS_URL = os.getenv('REDIS_URL')
conn = redis.Redis.from_url(url=REDIS_URL)
# Create an RQ queue using the Redis connection
q = Queue(connection=conn)
with Connection(conn):
    worker = Worker([q], connection=conn)  # This instruction works fine
    worker.work()  # The deployment fails here: the DigitalOcean server crashes at this instruction
The worker/job execution runs just fine locally but fails on DO's server.
What could be causing this? Is there anything I'm missing, or any kind of configuration that needs to be done on DO's end?
Thank you in advance!
I also tried to use FastAPI's BackgroundTask class. At first, it was running smoothly but the job stops running halfway through with no feedback on what was happening in the background from the class itself. I'm guessing it's due to a timeout that doesn't seem to have a custom configuration in FastAPI (perhaps because its background tasks are meant to be low-cost and fast).
I'm also thinking of trying Celery out, but I'm afraid I would run into the same issues as RQ.
Create a configuration file using this command:
sudo nano /etc/systemd/system/myproject.service
[Unit]
Description=Gunicorn instance to serve myproject
After=network.target
[Service]
User=user
Group=www-data
WorkingDirectory=/home/user/myproject
Environment="PATH=/home/user/myproject/myprojectvenv/bin"
ExecStart=/home/user/myproject/myprojectvenv/bin/gunicorn --workers 3 --bind unix:myproject.sock -m 007 wsgi:app
[program:rq_worker]
command=/home/user/myproject/myprojectvenv/bin/rq -A rq_worker -l info
directory=/home/user/myproject
autostart=true
autorestart=true
stderr_logfile=/var/log/celery.err.log
stdout_logfile=/var/log/celery.out.log
[Install]
WantedBy=multi-user.target
When I run celery -A app.celery worker --loglevel=INFO --pidfile='' I get back the following:
Usage: celery [OPTIONS] COMMAND [ARGS]...
Error: Invalid value for '-A' / '--app':
Unable to load celery application.
'nonetype' object has no attribute '_instantiate_plugins'
To the best of my understanding, in celery -A [name].celery... [name] should be the file where the Celery instance is created and held, which in my case is app.py.
This is my first time working with Celery, so would love help here!
My file structure is as follows:
app/
    app.py
    celery_config/
        __init__.py
        flask_celery.py
app.py
from flask import Flask
from celery_config.flask_celery import make_celery
# Create app
app = Flask(__name__)
app.config.from_envvar("APP_SETTINGS")
...
# Setup Celery
celery = make_celery(app)
flask_celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
def make_celery(app):
    celery = Celery(app.import_name)
    celery.conf.update(
        result_backend=app.config["CELERY_RESULT_BACKEND"],
        broker_url=app.config["CELERY_BROKER_URL"],
        timezone="UTC",
        task_serializer="json",
        accept_content=["json"],
        result_serializer="json"
    )
    TaskBase = celery.Task
    class ContextTask(TaskBase):
        abstract = True
        def __call__(self, *args, **kwargs):
            # Run every task inside the Flask application context
            with app.app_context():
                return TaskBase.__call__(self, *args, **kwargs)
    celery.Task = ContextTask
    return celery
I figured it out.
It turns out that, because I was running the command from inside the app directory, I needed to run celery -A celery worker --loglevel=INFO --pidfile='' rather than celery -A app.celery worker --loglevel=INFO --pidfile=''. -A app searches for celery within the directory app, but I was already in that directory. I only realized this after finding this GitHub comment.
UPDATE: This was also not the answer. The real issue was that I expected the Celery worker to pick up my .env variables, which it doesn't do because it's not a Flask-specific package. The app was trying to instantiate its database without the DATABASE_URL variable, so I had to export my .env variables into my local environment. celery -A app.celery worker --loglevel=INFO --pidfile='' was the right command.
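One way to avoid exporting the variables by hand is to load the .env file into the process environment before the app is created. The following is a minimal sketch, assuming the file contains only simple KEY=VALUE lines; `parse_env_lines` and `load_env_file` are hypothetical helpers, and a dedicated package such as python-dotenv handles more edge cases.

```python
import os


def parse_env_lines(lines):
    """Return a dict built from simple KEY=VALUE lines (comments and blanks skipped)."""
    values = {}
    for raw_line in lines:
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env_file(path, environ=None):
    """Copy variables from a .env-style file into the environment.

    Keys that are already set are left alone, so exported variables win.
    """
    environ = os.environ if environ is None else environ
    with open(path) as handle:
        for key, value in parse_env_lines(handle).items():
            environ.setdefault(key, value)
    return environ
```

Calling `load_env_file(".env")` at the top of app.py, before the Flask app reads its configuration, would make DATABASE_URL visible to the Celery worker process as well, since the worker imports the same module.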
I have a Django application where I defined a few @task functions in task.py to run periodically. I'm 100% sure that the issue is not caused by task.py or any related code, but by some configuration, perhaps in settings.py or in my Celery worker.
The task does run on schedule, but it runs multiple times.
Here are the celery worker logs:
celery -A cimexmonitor worker --loglevel=info -B -c 4
[2019-09-19 21:22:16,360: INFO/ForkPoolWorker-5] Project Monitor Started : APPProject1
[2019-09-19 21:22:16,361: INFO/ForkPoolWorker-4] Project Monitor Started : APPProject1
[2019-09-19 21:25:22,108: INFO/ForkPoolWorker-4] Project Monitor DONE : APPProject1
[2019-09-19 21:25:45,255: INFO/ForkPoolWorker-5] Project Monitor DONE : APPProject1
[2019-09-20 00:22:16,395: INFO/ForkPoolWorker-4] Project Monitor Started : APPProject2
[2019-09-20 00:22:16,398: INFO/ForkPoolWorker-5] Project Monitor Started : APPProject2
[2019-09-20 01:22:11,554: INFO/ForkPoolWorker-5] Project Monitor DONE : APPProject2
[2019-09-20 01:22:12,047: INFO/ForkPoolWorker-4] Project Monitor DONE : APPProject2
If you check the timestamps above, tasks.py schedules one task, but two Celery worker processes pick it up and execute the same task at the same time. I'm not sure why two workers took one task.
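To confirm that the same message really comes from two different pool processes almost simultaneously, the worker log can be scanned mechanically. This is a small debugging sketch using only the standard library; the log format is assumed to match the lines shown above, and `find_duplicate_runs` and the 5-second window are illustrative choices.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines such as:
# [2019-09-19 21:22:16,360: INFO/ForkPoolWorker-5] Project Monitor Started : APPProject1
LOG_LINE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}): "
    r"\w+/(?P<worker>[\w-]+)\] (?P<message>.*)"
)


def find_duplicate_runs(log_lines, window_seconds=5):
    """Return messages emitted by more than one pool process within the window."""
    seen = defaultdict(list)  # message -> [(timestamp, worker), ...]
    for line in log_lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        seen[match.group("message")].append((stamp, match.group("worker")))

    duplicates = {}
    for message, entries in seen.items():
        entries.sort()
        # Compare each entry with the next one in time order.
        for (first_ts, first_worker), (next_ts, next_worker) in zip(entries, entries[1:]):
            gap = (next_ts - first_ts).total_seconds()
            if first_worker != next_worker and gap <= window_seconds:
                duplicates[message] = sorted({first_worker, next_worker})
    return duplicates


sample = [
    "[2019-09-19 21:22:16,360: INFO/ForkPoolWorker-5] Project Monitor Started : APPProject1",
    "[2019-09-19 21:22:16,361: INFO/ForkPoolWorker-4] Project Monitor Started : APPProject1",
    "[2019-09-19 21:25:22,108: INFO/ForkPoolWorker-4] Project Monitor DONE : APPProject1",
    "[2019-09-19 21:25:45,255: INFO/ForkPoolWorker-5] Project Monitor DONE : APPProject1",
]
duplicates = find_duplicate_runs(sample)
print(duplicates)
```

Two start messages a millisecond apart from different processes suggest the task was queued twice, rather than one queued message being delivered to two processes.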
settings.py
..
..
# Internationalization
# https://docs.djangoproject.com/en/2.1/topics/i18n/
LANGUAGE_CODE = 'en-us'
TIME_ZONE = 'Asia/Kolkata'
USE_I18N = True
USE_L10N = True
USE_TZ = True
..
..
..
######## CELERY : CONFIG
CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ENABLE_UTC = True
CELERYBEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
celery.py
from __future__ import absolute_import, unicode_literals
from celery import Celery
import os
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE','cimexmonitor.settings')
## set the default Django settings module for the 'celery' program.
# Using a string here means the worker don't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app = Celery('cimexmonitor')
#app.config_from_object('django.conf:settings', namespace='CELERY')
app.config_from_object('django.conf:settings')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks(settings.INSTALLED_APPS)
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
Other information:
→ celery --version
4.3.0 (rhubarb)
→ redis-server --version
Redis server v=3.0.6 sha=00000000:0 malloc=jemalloc-3.6.0 bits=64 build=7785291a3d2152db
django-admin-interface==0.9.2
django-celery-beat==1.5.0
Please suggest ways to debug the problem.
Thanks
Both the worker and beat services need to be running at the same time to execute periodic tasks, as per https://github.com/celery/django-celery-beat
WORKER:
$ celery -A [project-name] worker --loglevel=info -B -c 5
Django scheduler:
celery -A [project-name] beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
I was running both the worker and the database scheduler at the same time, as the documentation describes, and that caused each task to be executed twice at the same time. I'm really not sure how the Celery worker started acting as a DB scheduler as well.
Running just the Celery worker solved my problem.
From the official documentation: Ensuring a task is only executed one at a time.
Also, I hope you are not running multiple workers the same way (celery -A cimexmonitor worker --loglevel=info -B -c 4) as that would mean you have multiple celery beats scheduling tasks to run... In short - make sure you only have one Celery beat running!
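The idea behind "only executed one at a time" is a lock taken through an atomic "add if absent" operation on a shared cache. Below is a minimal sketch of that pattern, not the documentation's code: `InMemoryCache` is a stand-in that only works within one process, and `task_lock` and `monitor_project` are hypothetical names. A real deployment would need a cache shared by all worker processes, such as memcached or Redis.

```python
import threading
import time
from contextlib import contextmanager


class InMemoryCache(object):
    """Tiny single-process stand-in for a shared cache offering an atomic add."""

    def __init__(self):
        self._data = {}
        self._guard = threading.Lock()

    def add(self, key, value, timeout):
        """Store value only if key is absent or expired; return True on success."""
        now = time.time()
        with self._guard:
            current = self._data.get(key)
            if current is not None and current[1] > now:
                return False
            self._data[key] = (value, now + timeout)
            return True

    def delete(self, key):
        with self._guard:
            self._data.pop(key, None)


@contextmanager
def task_lock(cache, lock_id, owner, expire_seconds=600):
    """Yield True if this caller acquired the lock, False if another holds it."""
    acquired = cache.add(lock_id, owner, expire_seconds)
    try:
        yield acquired
    finally:
        # Only the holder releases the lock.
        if acquired:
            cache.delete(lock_id)


def monitor_project(cache, project_name, owner):
    """Hypothetical task body: skip the run if the project is already being processed."""
    with task_lock(cache, "monitor-" + project_name, owner) as acquired:
        if not acquired:
            return "skipped"
        return "ran"
```

The expiry on the lock is a safety net: if a worker dies while holding it, the lock eventually frees itself instead of blocking the task forever.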
Using celery 3.1.25 with Django 1.10
I can get celery to run tasks by manually getting into the shell and manually launching the tasks. However when I set a task from the django/admin/PeriodicTasks (run every minute), these tasks are not picked up by celery.
I'm using flower to check the status but I don't see any failing task.
The broker node is called celery@USER-vm instead of default, so I don't know if that is affecting this.
my command to run celery is
python manage.py celery -A proj worker --loglevel=INFO -B
Any insight into where to look? My best guess is that djcelery is not connected to RabbitMQ, but I'm not sure where to make those changes.
Thanks!
EDIT:
Settings.py
BROKER_URL="amqp://guest:guest@localhost//"
CELERY_BROKER_URL="amqp://guest:guest@localhost//"
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
CELERY_SEND_TASK_ERROR_EMAILS=True
CELERYD_CONCURRENCY=8
CELERY_TASK_RESULT_EXPIRES=None
CELERY_ACCEPT_CONTENT = ['json', 'application','msgpack', 'yaml']
CELERY_DEFAULT_QUEUE='default'
CELERY_DEFAULT_EXCHANGE_TYPE='direct'
CELERY_DEFAULT_ROUTING_KEY='default'
CELERY_ENABLE_UTC=True
From celery.py
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE','proj.settings')
app = Celery('proj')
app.conf.update(
    CELERY_TASK_RESULT_EXPIRES=3600,
)
app.autodiscover_tasks(settings.INSTALLED_APPS, related_name='tasks')
Here's my celery app config:
from __future__ import absolute_import
from celery import Celery
import os
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'tshirtmafia.settings')
app = Celery('tshirtmafia')
app.conf.update(
    CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend',
)
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
settings.py:
INSTALLED_APPS:
'kombu.transport.django',
'djcelery',
also:
BROKER_URL = 'django://'
Here's my task:
@shared_task
def test():
    send_mail('nesamone bus', 'Files have been successfully generated.', 'marijus.merkevicius@gmail.com',
              ['marijus.merkevicius@gmail.com'], fail_silently=False)
Now when I run python manage.py celeryd locally and then run test.delay() from the shell, it works.
Now I'm trying to deploy my app. With the exact same configuration, when I run python manage.py celeryd and, in another window, open a shell and run the test task, it doesn't work.
I've also tried to setup background daemon like this:
/etc/default/celeryd configuration:
# Name of nodes to start, here we have a single node
CELERYD_NODES="w1"
# or we could have three nodes:
#CELERYD_NODES="w1 w2 w3"
# Where to chdir at start. (CATMAID Django project dir.)
CELERYD_CHDIR="/home/tshirtnation/"
# Python interpreter from environment. (in CATMAID Django dir)
ENV_PYTHON="/usr/bin/python"
# How to call "manage.py celeryd_multi"
CELERYD_MULTI="$ENV_PYTHON $CELERYD_CHDIR/manage.py celeryd_multi"
# How to call "manage.py celeryctl"
CELERYCTL="$ENV_PYTHON $CELERYD_CHDIR/manage.py celeryctl"
# Extra arguments to celeryd
CELERYD_OPTS="--time-limit=300 --concurrency=1"
# Name of the celery config module.
CELERY_CONFIG_MODULE="celeryconfig"
# %n will be replaced with the nodename.
CELERYD_LOG_FILE="/var/log/celery/%n.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"
# Workers should run as an unprivileged user.
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="settings"
And I use default celery /etc/init.d/celeryd script.
So basically it seems like celeryd starts but doesn't work. No idea how to debug this and what might be wrong.
Let me know if you need anything else
Celery has turned out to be a very capricious child in Django's otherwise robust system, at least for me.
There is too little initial data to understand the cause of your problems.
The most common reason a Celery daemon fails is file system permissions.
To narrow down the cause, I'd try the following:
Start celery from the command line as the user who owns the django project:
celery -A proj worker -l info
If it works OK, go further.
Start celery in verbose mode as the root user, just as the daemon would be started:
sudo sh -x /etc/init.d/celeryd start
This will show most of the problems with the daemon script, such as the celery user and group used, but unfortunately not all of them: permission failures are not visible.
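Since permission failures stay hidden, it can help to check directly whether the daemon user can write to the log and pid directories. A minimal sketch, assuming the directories implied by CELERYD_LOG_FILE and CELERYD_PID_FILE in the question; `check_writable` is a hypothetical helper and should be run as the user the daemon will use (for example via sudo -u celery).

```python
import os


def check_writable(paths):
    """Map each directory to a short status string describing write access.

    os.access reports on the real uid/gid of the calling process, so the
    result only reflects the user running this script.
    """
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            report[path] = "missing"
        elif os.access(path, os.W_OK | os.X_OK):
            report[path] = "writable"
        else:
            report[path] = "not writable"
    return report


if __name__ == "__main__":
    # Directories taken from CELERYD_LOG_FILE and CELERYD_PID_FILE in the question.
    results = check_writable(["/var/log/celery", "/var/run/celery"])
    for path in sorted(results):
        print(path, results[path])
```

A "missing" or "not writable" result for either directory would be consistent with a daemon that appears to start but never processes anything.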
A small remark of my own.
Usually Celery is started by its own celery user, and the django project by another one. After a long fight with celery and the system, I gave up on the celery user and ran the celery process as the django project user.
And don't forget to run once:
update-rc.d celerybeat defaults
update-rc.d celeryd defaults
This is for starting the daemons on Ubuntu, of course.
Good luck