Celery Async Tasks and Periodic Tasks together - python

I'm unable to run periodic tasks and asynchronous tasks together. If I comment out the periodic task, asynchronous tasks execute fine; otherwise the asynchronous tasks get stuck.
Running: celery==4.0.2, Django==2.0, django-celery-beat==1.1.0, django-celery-results==1.0.1
I referred to https://github.com/celery/celery/issues/4184 when choosing celery==4.0.2, as that version seems to work.
This seems to be a known issue: https://github.com/celery/django-celery-beat/issues/27
"I've also done some digging; the ONLY way I've found to get it back to normal is to remove all periodic tasks and restart celery beat." ~ rh0dium
celery.py
import django
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'bid.settings')
# Setup django project
django.setup()
app = Celery('bid')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
settings.py
INSTALLED_APPS = (
...
'django_celery_results',
'django_celery_beat',
)
# Celery related settings
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 43200, }
CELERY_RESULT_BACKEND = 'django-db'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
CELERY_CONTENT_ENCODING = 'utf-8'
CELERY_ENABLE_REMOTE_CONTROL = False
CELERY_SEND_EVENTS = False
CELERY_TIMEZONE = 'Asia/Kolkata'
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
Periodic task
@periodic_task(run_every=crontab(hour=7, minute=30), name="send-vendor-status-everyday")
def send_vendor_status():
    return timezone.now()
Async task
@shared_task
def vendor_creation_email(id):
    return "Email Sent"
Async task caller
vendor_creation_email.apply_async(args=[instance.id, ]) # main thread gets stuck here, if periodic jobs are scheduled.
Running the worker with beat as follows:
celery worker -A bid -l debug -B
Please help.

Here are a few observations, resulting from multiple trials and errors and from digging into Celery's source code.
@periodic_task is deprecated, hence it will not work.
from their source code:
# venv36/lib/python3.6/site-packages/celery/task/base.py
def periodic_task(*args, **options):
    """Deprecated decorator, please use :setting:`beat_schedule`."""
    return task(**dict({'base': PeriodicTask}, **options))
Use UTC as the base timezone, to avoid timezone-related confusion later on. Configure the periodic task to fire at times calculated with respect to UTC, e.g. for 'Asia/Calcutta' reduce the desired local time by 5 hours 30 minutes.
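For example, the question's 7:30 AM 'Asia/Kolkata' schedule works out to 02:00 UTC (07:30 minus 05:30), which is what the crontab entry in the celery.py below encodes:
from celery.schedules import crontab
crontab(hour=2, minute=0)  # 02:00 UTC == 07:30 IST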
Create a celery.py as follows:
celery.py
import django
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
from celery.schedules import crontab
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')
# Setup django project
django.setup()
app = Celery('proj')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
app.conf.beat_schedule = {
    'test_task': {
        'task': 'test_task',
        'schedule': crontab(hour=2, minute=0),
    },
}
and the task could be in tasks.py under any app, as follows:
@shared_task(name="test_task")
def test_add():
    print("Testing beat service")
Use celery worker -A proj -l info for the worker and celery beat -A proj -l info for beat, along with a broker such as Redis, and this setup should work fine.
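Alternatively, the same schedule can live in the Django settings instead of celery.py (a minimal sketch, assuming the namespace='CELERY' configuration shown above, under which CELERY_BEAT_SCHEDULE maps to beat_schedule):
# settings.py -- picked up by app.config_from_object('django.conf:settings', namespace='CELERY')
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    'test_task': {
        'task': 'test_task',
        'schedule': crontab(hour=2, minute=0),  # 02:00 UTC == 07:30 IST
    },
}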

Related

Heroku / Redis Connection error on Python/Django Project

I'm trying to set up Django so it sends an automatic email when a certain date in my models is reached. I set up a Heroku Redis server and am trying to connect to it. I created a simple task to test whether Celery is working, but it always returns the following error:
Error while reading from socket: (10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)
I set up Celery according to the website:
celery.py:
import os
from celery import Celery
# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'incFleet.settings')
app = Celery('incFleet')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django apps.
app.autodiscover_tasks()
@app.task(bind=True)
def debug_task(self):
    print(f'Request: {self.request!r}')
Tasks.py
import datetime
from celery import shared_task, task
from time import sleep
from .models import trucks
from datetime import datetime
@shared_task()
def sleepy(duration):
    sleep(duration)
    return 0
# @shared_task()
# def send_warning():
My views:
def index(request):
    sleepy.delay(5)
    return render(request, 'Inventory/index.html')
And my settings.py
# Celery Broker - Redis
CELERY_BROKER_URL = 'redis://:p0445df1196b44ba70a9bd0c84545315fec8a5dcbd77c8e8c4bd22ff4cd0a2ff4@ec2-54-167-58-171.compute-1.amazonaws.com:11900/'
CELERY_RESULT_BACKEND = 'redis://:p0445df1196b44ba70a9bd0c84545315fec8a5dcbd77c8e8c4bd22ff4cd0a2ff4@ec2-54-167-58-171.compute-1.amazonaws.com:11900/'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
After a while I realized the issue was with the network in my office; the port I was trying to access was closed. As soon as I activated Redis on the actual server it started working fine.
Check the actual URL with:
heroku config | grep REDIS
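To avoid hard-coding credentials, the broker URL can also be read from the environment (a sketch, assuming the Heroku Redis add-on exposes the connection string as REDIS_URL):
# settings.py
import os

# Heroku Redis provides REDIS_URL; fall back to a local instance for development.
CELERY_BROKER_URL = os.environ.get('REDIS_URL', 'redis://localhost:6379/0')
CELERY_RESULT_BACKEND = os.environ.get('REDIS_URL', 'redis://localhost:6379/0')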

Why isn't celery periodic task working?

I'm trying to create a periodic task within a Django app.
I added this to my settings.py:
from datetime import timedelta
CELERYBEAT_SCHEDULE = {
    'get_checkins': {
        'task': 'api.tasks.get_checkins',
        'schedule': timedelta(seconds=1)
    }
}
I'm just getting started with Celery and haven't figured out which broker I want to use, so I added this as well to just bypass the broker for the time being:
if DEBUG:
    CELERY_ALWAYS_EAGER = True
I also created a celery.py file in my project folder:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'testproject.settings')
app = Celery('testproject')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
Inside my app, called api, I made a tasks.py file:
from celery import shared_task
@shared_task
def get_checkins():
    print('hello from get checkins')
I'm running the worker and beat with celery -A testproject worker --beat -l info
It starts up fine and I can see the task is registered under [tasks], but I don't see any jobs getting logged. There should be one per second. Can anyone tell me why this isn't executing?
I looked at your post and don't see any mention of the broker you are using along with Celery.
Have you installed a broker like RabbitMQ? Is it running, or logging some kind of error?
Celery needs a broker to send and receive data.
Check the documentation here (http://docs.celeryproject.org/en/latest/getting-started/first-steps-with-celery.html#choosing-a-broker)
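For example, a minimal broker configuration in settings.py might look like the sketch below (assuming a local Redis or RabbitMQ is installed and running; with the namespace='CELERY' setup in your celery.py, the setting needs the CELERY_ prefix):
# settings.py
# Pick one broker and make sure the corresponding service is actually running.
CELERY_BROKER_URL = 'redis://localhost:6379/0'
# or, for a default local RabbitMQ installation:
# CELERY_BROKER_URL = 'amqp://guest:guest@localhost:5672//'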

Celery task results not persisted with rpc

I have been trying to get Celery task results routed to another process by having the results persisted to a queue, from which the other process can pick them up. I have configured Celery with CELERY_RESULT_BACKEND = 'rpc', but the value returned by the Python function is still not persisted to a queue.
I'm not sure if any other configuration or code change is required. Please help.
Here is the code example:
celery.py
from __future__ import absolute_import
from celery import Celery
app = Celery('proj',
             broker='amqp://',
             backend='rpc://',
             include=['proj.tasks'])
# Optional configuration, see the application user guide.
app.conf.update(
    CELERY_RESULT_BACKEND='rpc',
    CELERY_RESULT_PERSISTENT=True,
    CELERY_TASK_SERIALIZER='json',
    CELERY_RESULT_SERIALIZER='json'
)
if __name__ == '__main__':
    app.start()
tasks.py
from proj.celery import app
@app.task
def add(x, y):
    return x + y
Running Celery as
celery worker --app=proj -l info --pool=eventlet -c 4
Solved by using Pika (a Python implementation of the AMQP 0-9-1 protocol, https://pika.readthedocs.org) to post results back to the celeryresults channel.
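For reference, a minimal Pika sketch of publishing a result payload to such a queue (assuming a local RabbitMQ and a queue named celeryresults, as described above; the payload shape is illustrative only):
import json
import pika

# Connect to a local RabbitMQ broker.
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare the queue the consuming process reads from.
channel.queue_declare(queue='celeryresults', durable=True)

# Publish the task result as JSON on the default exchange.
channel.basic_publish(exchange='',
                      routing_key='celeryresults',
                      body=json.dumps({'task': 'add', 'result': 8}))

connection.close()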

Celery events specific to a queue

I have two Django projects, each with a Celery app:
- fooproj.celery_app
- barproj.celery_app
Each app is running its own Celery worker:
celery worker -A fooproj.celery_app -l info -E -Q foo_queue
celery worker -A barproj.celery_app -l info -E -Q bar_queue
Here's how I am configuring my Celery apps:
import os
from celery import Celery
from django.conf import settings
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings.local')
app = Celery('celery_app', broker=settings.BROKER_URL)
app.conf.update(
    CELERY_ACCEPT_CONTENT=['json'],
    CELERY_TASK_SERIALIZER='json',
    CELERY_RESULT_SERIALIZER='json',
    CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend',
    CELERY_SEND_EVENTS=True,
    CELERY_DEFAULT_QUEUE=settings.CELERY_DEFAULT_QUEUE,
    CELERY_DEFAULT_EXCHANGE=settings.CELERY_DEFAULT_EXCHANGE,
    CELERY_DEFAULT_ROUTING_KEY=settings.CELERY_DEFAULT_ROUTING_KEY,
    CELERY_DEFAULT_EXCHANGE_TYPE='direct',
    CELERY_ROUTES=('proj.celeryrouters.MainRouter', ),
    CELERY_IMPORTS=(
        'apps.qux.tasks',
        'apps.lorem.tasks',
        'apps.ipsum.tasks',
        'apps.sit.tasks'
    ),
)
My router class:
from django.conf import settings
class MainRouter(object):
    """
    Routes Celery tasks to a proper exchange and queue.
    """
    def route_for_task(self, task, args=None, kwargs=None):
        return {
            'exchange': settings.CELERY_DEFAULT_EXCHANGE,
            'exchange_type': 'direct',
            'queue': settings.CELERY_DEFAULT_QUEUE,
            'routing_key': settings.CELERY_DEFAULT_ROUTING_KEY,
        }
fooproj has settings:
BROKER_URL = 'redis://localhost:6379/0'
CELERY_DEFAULT_EXCHANGE = 'foo_exchange'
CELERY_DEFAULT_QUEUE = 'foo_queue'
CELERY_DEFAULT_ROUTING_KEY = 'foo_routing_key'
barproj has settings:
BROKER_URL = 'redis://localhost:6379/1'
CELERY_DEFAULT_EXCHANGE = 'foo_exchange'
CELERY_DEFAULT_QUEUE = 'foo_queue'
CELERY_DEFAULT_ROUTING_KEY = 'foo_routing_key'
As you can see, both projects use their own Redis database as a broker, their own MySQL database as a result backend, their own exchange, queue and routing key.
I am trying to have two Celery events processes running, one for each app:
celery events -A fooproj.celery_app -l info -c djcelery.snapshot.Camera
celery events -A barproj.celery_app -l info -c djcelery.snapshot.Camera
The problem is, both celery events processes are picking up tasks from all of my Celery workers! So in the fooproj database, I can see task results from barproj database.
Any idea how to solve this problem?
From http://celery.readthedocs.org/en/latest/getting-started/brokers/redis.html:
Monitoring events (as used by flower and other tools) are global and
is not affected by the virtual host setting.
This is caused by a limitation in Redis. The Redis PUB/SUB channels are global and not affected by the database number.
This seems to be one of Redis' caveats :(
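Since the PUB/SUB channels are global only within a single Redis server, one workaround (a sketch, not from the original answer: it assumes each project can be pointed at an entirely separate Redis instance rather than just a different database number) is:
# fooproj settings
BROKER_URL = 'redis://localhost:6379/0'

# barproj settings -- a second Redis server on its own port, so its events stay separate
BROKER_URL = 'redis://localhost:6380/0'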

Celery tasks in Django are always blocking

I have the following setup in my django settings:
CELERY_TASK_RESULT_EXPIRES = timedelta(minutes=30)
CELERY_CHORD_PROPAGATES = True
CELERY_ACCEPT_CONTENT = ['json', 'msgpack', 'yaml']
CELERY_ALWAYS_EAGER = True
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True
BROKER_URL = 'django://'
CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend'
I've included this under my installed apps:
'djcelery',
'kombu.transport.django'
My project structure is (django 1.5)
proj
|_proj
    __init__.py
    celery.py
|_apps
    |_myapp1
        |_models.py
        |_tasks.py
This is my celery.py file:
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings.dev')
app = Celery('proj')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS, related_name='tasks')
In the main __init__.py I have:
from __future__ import absolute_import
from .celery import app as celery_app
And finally in myapp1/tasks.py I define my task:
@task()
def retrieve():
    # Do my stuff
Now, if I open a Django interactive shell and launch the retrieve task:
result = retrieve.delay()
it always seems to be a blocking call, meaning that the prompt is blocked until the function returns. The result status is SUCCESS and the function actually performs the operations, BUT it does not seem to be async. What am I missing?
It seems like CELERY_ALWAYS_EAGER causes this:
if this is True, all tasks will be executed locally by blocking until
the task returns. apply_async() and Task.delay() will return an
EagerResult instance, which emulates the API and behavior of
AsyncResult, except the result is already evaluated.
That is, tasks will be executed locally instead of being sent to the
queue.
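With CELERY_ALWAYS_EAGER removed (and a real broker plus a running worker), delay() returns immediately; a quick sketch of the non-blocking behaviour:
result = retrieve.delay()   # returns an AsyncResult straight away
result.ready()              # False while the worker is still processing
result.get(timeout=30)      # only this call blocks, when you ask for the value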
