I'm trying to create a periodic task within a Django app.
I added this to my settings.py:
from datetime import timedelta
CELERYBEAT_SCHEDULE = {
'get_checkins': {
'task': 'api.tasks.get_checkins',
'schedule': timedelta(seconds=1)
}
}
I'm just getting started with Celery and haven't figured out which broker I want to use, so I added this as well to just bypass the broker for the time being:
if DEBUG:
CELERY_ALWAYS_EAGER = True
I also created a celery.py file in my project folder:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'testproject.settings')
app = Celery('testproject')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
Inside my app, called api, I made a tasks.py file:
from celery import shared_task
#shared_task
def get_checkins():
print('hello from get checkins')
I'm running the worker and beat with celery -A testproject worker --beat -l info
It starts up fine and I can see the task is registered under [tasks], but I don't see any jobs getting logged. Should be one per second. Can anyone tell why this isn't executing?
I looked at your post and don't see any comment on the broker you are using along with celery.
Have you installed a broker like Rabbitmq? Is it running or logging some kind of error?
Celery needs a broker to send and receive data.
Check the documentation here (http://docs.celeryproject.org/en/latest/getting-started/first-steps-with-celery.html#choosing-a-broker)
Related
I want to monitor/report task statuses, but tasks are saved only when all tasks are done. I want them to be saved as soon as they start.
'''inside my Queue project'''
celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'Queue.settings')
app = Celery('Queue',
broker='redis://localhost:6379',
backend='django-db',
task_track_started=True,
include=['index.tasks'])
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
inside "index" app
tasks.py
from __future__ import absolute_import, unicode_literals
from celery import shared_task, current_task
import time
#shared_task
def gozle():
time.sleep(15)
return 1
views.py
def index(request):
gozle.delay()
return HttpResponse('admin')
I expect that as I visit index page my task be triggered and be recorded into ResultTask, but it waits 15 seconds then be recorded.
The start of the task should be acknowledged immediately, you should see in your celery server a log entry that says that gozle task started, then it wait 15 seconds and will exit, this exit state will also be on the celery server log. If that is not the case maybe it has something to do with task_acks_late enabled in celery config, here is the doc, making it only acknowledge tasks after they end.
Unable to run periodic tasks along with asynchronous tasks together. Although, if I comment out the periodic task, asynchronous tasks are executed fine, else asynchronous tasks are stuck.
Running: celery==4.0.2, Django==2.0, django-celery-beat==1.1.0, django-celery-results==1.0.1
Referred: https://github.com/celery/celery/issues/4184 to choose celery==4.0.2 version, as it seems to work.
Seems to be a known issue
https://github.com/celery/django-celery-beat/issues/27
I've also done some digging the ONLY way I've found to get it back to
normal is to remove all periodic tasks and restart celery beat. ~ rh0dium
celery.py
import django
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'bid.settings')
# Setup django project
django.setup()
app = Celery('bid')
# Using a string here means the worker don't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
settings.py
INSTALLED_APPS = (
...
'django_celery_results',
'django_celery_beat',
)
# Celery related settings
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 43200, }
CELERY_RESULT_BACKEND = 'django-db'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
CELERY_CONTENT_ENCODING = 'utf-8'
CELERY_ENABLE_REMOTE_CONTROL = False
CELERY_SEND_EVENTS = False
CELERY_TIMEZONE = 'Asia/Kolkata'
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
Periodic task
#periodic_task(run_every=crontab(hour=7, minute=30), name="send-vendor-status-everyday")
def send_vendor_status():
return timezone.now()
Async task
#shared_task
def vendor_creation_email(id):
return "Email Sent"
Async task caller
vendor_creation_email.apply_async(args=[instance.id, ]) # main thread gets stuck here, if periodic jobs are scheduled.
Running the worker, with beat as follows
celery worker -A bid -l debug -B
Please help.
Here are a few observations, resulted from multiple trial and errors, and diving into celery's source code.
#periodic_task is deprecated. Hence it would not work.
from their source code:
#venv36/lib/python3.6/site-packages/celery/task/base.py
def periodic_task(*args, **options):
"""Deprecated decorator, please use :setting:`beat_schedule`."""
return task(**dict({'base': PeriodicTask}, **options))
Use UTC as base timezone, to avoid timezone related confusions later on. Configure periodic task to fire on calculated times with respect to UTC. e.g. for 'Asia/Calcutta' reduce the time by 5hours 30mins.
Create a celery.py as follows:
celery.py
import django
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
from celery.schedules import crontab
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')
# Setup django project
django.setup()
app = Celery('proj')
# Using a string here means the worker don't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
app.conf.beat_schedule = {
'test_task': {
'task': 'test_task',
'schedule': crontab(hour=2,minute=0),
}
}
and task could be in tasks.py under any app, as follows
#shared_task(name="test_task")
def test_add():
print("Testing beat service")
Use celery worker -A proj -l info and celery beat -A proj -l info for worker and beat, along with a broker e.g. redis. and this setup should work fine.
I have a flask app that roughly looks like this:
app = Flask(__name__)
#app.route('/',methods=['POST'])
def foo():
data = json.loads(request.data)
# do some stuff
return "OK"
Now in addition I would like to run a function every ten seconds from that script. I don't want to use sleep for that. I have the following celery script in addition:
from celery import Celery
from datetime import timedelta
celery = Celery('__name__')
CELERYBEAT_SCHEDULE = {
'add-every-30-seconds': {
'task': 'tasks.add',
'schedule': timedelta(seconds=10)
},
}
#celery.task(name='tasks.add')
def hello():
app.logger.info('run my function')
The script works fine, but the logger.info is not executed. What am I missing?
Do you have Celery worker and Celery beat running? Scheduled tasks are handled by beat, which queues the task mentioned when appropriate. Worker then actually crunches the numbers and executes your task.
celery worker --app myproject--loglevel=info
celery beat --app myproject
Your task however looks like it's calling the Flask app's logger. When using the worker, you probably don't have the Flask application around (since it's in another process). Try using a normal Python logger for the demo task.
Well, celery beat can be embedded in regular celery worker as well, with -B parameter in your command.
celery -A --app myproject --loglevel=info -B
It is only recommended for the development environment. For production, you should run beat and celery workers separately as documentation mentions. Otherwise, your periodic task will run more than one time.
A celery task by default will run outside of the Flask app context and thus it won't have access to Flask app instance. However it's very easy to create the Flask app context while running a task by using app_context method of the Flask app object.
app = Flask(__name__)
celery = Celery(app.name)
#celery.task
def task():
with app.app_context():
app.logger.info('running my task')
This article by Miguel Grinberg is a very good place to get a primer on the basics of using Celery in a Flask application.
First install the redis on machine and check it is running or not.
install the python dependencies
celery
redis
flask
folder structure
project
app
init.py
task.py
main.py
write task.py
from celery import Celery
from celery.schedules import crontab
from app import app
from app.scrap import product_data
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
def make_celery(app):
#Celery configuration
app.config['CELERY_BROKER_URL'] = 'redis://127.0.0.1:6379'
app.config['CELERY_RESULT_BACKEND'] = 'db+postgresql://user:password#172.17.0.3:5432/mydatabase'
app.config['CELERY_RESULT_EXTENDED']=True
app.config['CELERYBEAT_SCHEDULE'] = {
# Executes every minute
'periodic_task-every-minute': {
'task': 'periodic_task',
'schedule': crontab(minute="*")
}
}
celery = Celery(app.import_name, broker=app.config['CELERY_BROKER_URL'])
celery.conf.update(app.config)
TaskBase = celery.Task
class ContextTask(TaskBase):
abstract = True
def __call__(self, *args, **kwargs):
with app.app_context():
return TaskBase.__call__(self, *args, **kwargs)
celery.Task = ContextTask
return celery
celery = make_celery(app)
#celery.task(name="periodic_task",bind=True)
def testing(self):
file1 = open("../myfile.txt", "a")
# writing newline character
file1.write("\n")
file1.write("Today")
#faik
print("Running")
self.request.task_name = "state"
logger.info("Hello! from periodic task")
return "Done"
write init.py
from flask import Flask, Blueprint,request
from flask_restx import Api,Resource,fields
from flask_sqlalchemy import SQLAlchemy
import redis
from rq import Queue
app = Flask(__name__)
app.config['SECRET_KEY']='7c09ebc8801a0ce8fb82b3d2ec51b4db'
app.config['SQLALCHEMY_DATABASE_URI']='sqlite:///site.db'
db=SQLAlchemy(app)
command to run celery beat and worker
celery -A app.task.celery beat
celery -A app.task.celery worker --loglevel=info
I would like to launch a periodic task every second but only if the previous task ended (db polling to send task to celery).
In the Celery documentation they are using the Django cache to make a lock.
I tried to use the example:
from __future__ import absolute_import
import datetime
import time
from celery import shared_task
from django.core.cache import cache
LOCK_EXPIRE = 60 * 5
#shared_task
def periodic():
acquire_lock = lambda: cache.add('lock_id', 'true', LOCK_EXPIRE)
release_lock = lambda: cache.delete('lock_id')
a = acquire_lock()
if a:
try:
time.sleep(10)
print a, 'Hello ', datetime.datetime.now()
finally:
release_lock()
else:
print 'Ignore'
with the following configuration:
app.conf.update(
CELERY_IGNORE_RESULT=True,
CELERY_ACCEPT_CONTENT=['json'],
CELERY_TASK_SERIALIZER='json',
CELERY_RESULT_SERIALIZER='json',
CELERYBEAT_SCHEDULE={
'periodic_task': {
'task': 'app_task_management.tasks.periodic',
'schedule': timedelta(seconds=1),
},
},
)
But in the console, I never see the Ignore message and I have Hello every second. It seems that the lock is not working fine.
I launch the periodic task with:
celeryd -B -A my_app
and the worker with:
celery worker -A my_app -l info
Could you please correct my misunderstanding?
From the Django Cache Framework documentation about local-memory cache:
Note that each process will have its own private cache instance, which
means no cross-process caching is possible.
So basically your workers are each dealing with their own cache. If you need a low resource cost cache backend I would recommend File Based Cache or Database Cache, both allow cross-process.
I'm really struggling to set up a periodic task using Celery Beat on Windows 7 (unfortunately that is what I'm dealing with at the moment). The app that will be using celery is written with CherryPy, so the Django libraries are not relevant here. All I'm looking for is a simple example of how to start the Celery Beat Process in the background. The FAQ section says the following, but I haven't been able to actually do it yet:
Windows
The -B / –beat option to worker doesn’t work?¶
Answer: That’s right. Run celery beat and celery worker as separate services instead.
My project layout is as follows:
proj/
__init__.py (empty)
celery.py
celery_schedule.py
celery_settings.py (these work
tasks.py
celery.py:
from __future__ import absolute_import
from celery import Celery
from proj import celery_settings
from proj import celery_schedule
app = Celery(
'proj',
broker=celery_settings.BROKER_URL,
backend=celery_settings.CELERY_RESULT_BACKEND,
include=['proj.tasks']
)
# Optional configuration, see the application user guide.
app.conf.update(
CELERY_TASK_RESULT_EXPIRES=3600,
CELERYBEAT_SCHEDULE=celery_schedule.CELERYBEAT_SCHEDULE
)
if __name__ == '__main__':
app.start()
tasks.py
from __future__ import absolute_import
from proj.celery import app
#app.task
def add(x, y):
return x + y
celery_schedule.py
from datetime import timedelta
CELERYBEAT_SCHEDULE = {
'add-every-30-seconds': {
'task': 'tasks.add',
'schedule': timedelta(seconds=3),
'args': (16, 16)
},
}
Running "celery worker --app=proj -l info" from the command line (from the parent directory of "proj") starts the worker thread just fine and I can execute the add task from the Python terminal. However, I just can't figure out how to start the beat service. Obviously the syntax is probably incorrect as well because I haven't gotten past the missing --beat option.
Just start another process via a new terminal window, make sure you are in the correct directory and execute the command celery beat (no '--' needed preceding the beat keyword).
If this does not solve your issue, rename your celery_schedule.py file to celeryconfig.py and include it in your celery.py file as: app.config_from_object('celeryconfig') right above your name == main
then spawn a new celery beat process: celery beat