I am having difficulty understanding how to run Celery after setting up some scheduled tasks.
Firstly, my project directory is structured as follows:
blogpodapi\api\__init__.py contains
from tasks import app
import celeryconfig
blogpodapi\api\celeryconfig.py contains
from datetime import timedelta
# Celery settings
CELERY_BROKER_URL = 'redis://localhost:6379/0'
BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
CELERY_IMPORTS = ("api.tasks",)
CELERYBEAT_SCHEDULE = {
    'write-test': {
        'task': 'api.tasks.addrandom',
        'schedule': timedelta(seconds=2),
        'args': (16000, 42)
    },
}
blogpodapi\api\tasks.py contains
from __future__ import absolute_import
import random
from celery import Celery
app = Celery('blogpodapi')
@app.task
def add(x, y):
    r = x + y
    print "task arguments: {x}, {y}".format(x=x, y=y)
    print "task result: {r}".format(r=r)
    return r

@app.task
def addrandom(x, *args):  # *args are not used, just there to be interchangeable with add(x, y)
    y = random.randint(1, 100)
    print "passing to add(x, y)"
    return add(x, y)
blogpodapi\blogpodapi\__init__.py contains
from __future__ import absolute_import
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app # noqa
blogpodapi\blogpodapi\settings.py contains
...
# Celery settings
CELERY_BROKER_URL = 'redis://localhost:6379/0'
BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
CELERY_IMPORTS = ("api.tasks",)
...
I run celery -A blogpodapi worker --loglevel=info in command prompt and get the following:
D:\blogpodapi>celery -A blogpodapi worker --loglevel=info
-------------- celery@JM v3.1.23 (Cipater)
---- **** -----
--- * *** * -- Windows-8-6.2.9200
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: blogpodapi:0x348a940
- ** ---------- .> transport: redis://localhost:6379/0
- ** ---------- .> results: redis://localhost:6379/1
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[tasks]
. api.tasks.add
. api.tasks.addrandom
. blogpodapi.celery.debug_task
[2016-08-13 13:01:51,108: INFO/MainProcess] Connected to redis://localhost:6379/0
[2016-08-13 13:01:52,122: INFO/MainProcess] mingle: searching for neighbors
[2016-08-13 13:01:55,138: INFO/MainProcess] mingle: all alone
c:\python27\lib\site-packages\celery\fixups\django.py:265: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2016-08-13 13:02:00,157: WARNING/MainProcess] c:\python27\lib\site-packages\celery\fixups\django.py:265: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2016-08-13 13:02:27,790: WARNING/MainProcess] celery@JM ready.
I then run celery -A blogpodapi beat in command prompt and get the following:
D:\blogpodapi>celery -A blogpodapi beat
celery beat v3.1.23 (Cipater) is starting.
__ - ... __ - _
Configuration ->
. broker -> redis://localhost:6379/0
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%INFO
. maxinterval -> now (0s)
[2016-08-13 13:02:51,937: INFO/MainProcess] beat: Starting...
For some reason, I can't see my periodic tasks being logged. Is there anything I am doing wrong?
UPDATE: here is my celery.py...
from __future__ import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'blogpodapi.settings')
from django.conf import settings # noqa
app = Celery('blogpodapi')
# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
You need to run celery beat with the Celery app that actually loads your settings:
celery -A blogpodapi.celery beat --loglevel=INFO
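Also note that the CELERYBEAT_SCHEDULE defined in api/celeryconfig.py is, as far as I can tell, never loaded by the app in blogpodapi/celery.py, which reads its configuration from the Django settings via config_from_object('django.conf:settings'). A minimal sketch (assuming you keep the Django settings as the single source of configuration) is to declare the schedule there instead, so both the worker and beat see it:
# blogpodapi/blogpodapi/settings.py (sketch -- beat reads this because the app
# calls app.config_from_object('django.conf:settings'))
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    'write-test': {
        'task': 'api.tasks.addrandom',
        'schedule': timedelta(seconds=2),
        'args': (16000, 42),
    },
}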
Related
I am building a lead generation portal that can be accessed online. Please don't mind the verbosity of the code, I'm doing a lot of debugging right now.
My Celery worker inconsistently picks up tasks assigned to it, and I'm not sure why.
The weird thing is that sometimes it works 100% perfectly: there are never any explicit errors in the terminal.
I am currently running with DEBUG = True and Redis as the broker!
celery start worker terminal command and response
celery -A mysite worker -l info --pool=solo
-------------- celery@DESKTOP-OG8ENRQ v5.0.2 (singularity)
--- ***** -----
-- ******* ---- Windows-10-10.0.19041-SP0 2020-11-09 00:36:13
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: mysite:0x41ba490
- ** ---------- .> transport: redis://localhost:6379//
- ** ---------- .> results: redis://localhost:6379/
- *** --- * --- .> concurrency: 12 (solo)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. mysite.celery.debug_task
. submit
[2020-11-09 00:36:13,899: INFO/MainProcess] Connected to redis://localhost:6379//
[2020-11-09 00:36:14,939: WARNING/MainProcess] c:\users\coole\pycharmprojects\lead_django_retry\venv\lib\site-packages\celery\app\control.py:48: DuplicateNodenameWarning: Received multiple replies from node name: celery@DESKTOP-OG8ENRQ.
Please make sure you give each node a unique nodename using
the celery worker `-n` option.
warnings.warn(DuplicateNodenameWarning(
[2020-11-09 00:36:14,939: INFO/MainProcess] mingle: all alone
[2020-11-09 00:36:14,947: INFO/MainProcess] celery@DESKTOP-OG8ENRQ ready.
views.py
class LeadInputView(FormView):
    template_name = 'lead_main.html'
    form_class = LeadInput

    def form_valid(self, form):
        print("I'm at views")
        form.submit()
        print(form.submit)
        return HttpResponseRedirect('./success/')
tasks.py
@task(name="submit")
def start_task(city, category, email):
    print("I'm at tasks!")
    print(city, category, email)
    """sends an email when feedback form is filled successfully"""
    logger.info("Submitted")
    return start(city, category, email)
forms.py
class LeadInput(forms.Form):
    city = forms.CharField(max_length=50)
    category = forms.CharField(max_length=50)
    email = forms.EmailField()

    def submit(self):
        print("I'm at forms!")
        x = start_task.delay(self.cleaned_data['city'], self.cleaned_data['category'], self.cleaned_data['email'])
        return x
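As a debugging aid (a sketch, not part of the original post; it only uses the standard AsyncResult attributes), submit() could log the id and state of the result returned by .delay(). The state stays PENDING until a worker picks the task up, which helps tell "never published" apart from "never consumed":

    def submit(self):
        print("I'm at forms!")
        result = start_task.delay(
            self.cleaned_data['city'],
            self.cleaned_data['category'],
            self.cleaned_data['email'],
        )
        # result is an AsyncResult: .id identifies the task, .state remains
        # PENDING until a worker starts it (a result backend must be configured
        # for later states to become visible)
        print(result.id, result.state)
        return result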
celery.py
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')
app = Celery('mysite')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
settings.py
BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
The runserver terminal will look something like this:
I'm at views
I'm at forms!
<bound method LeadInput.submit of <LeadInput bound=True, valid=True, fields=(city;category;email)>>
But the worker doesn't say that it picked up anything, just "celery@DESKTOP-OG8ENRQ ready." Except sometimes it does work, for some reason. I'm at a loss!
Hello to whoever sees this. It turns out that this is a bug with Celery (or maybe Redis?); apparently many Windows users run into it. https://github.com/celery/celery/issues/3759
It turns out the answer is to pass -P solo when starting the worker. I'm not sure why this is the case, but that solved it!
Thank you Naqib for your help! You put me down the right rabbit hole to a solution.
By default, Celery uses the hostname as the worker name. If you want to run multiple workers on the same host, give each one a unique name with the -n option:
celery -A mysite worker -l info --pool=solo -n worker2@%h
Your code works fine, but the task is passed to the first worker; see
DuplicateNodenameWarning with no obvious reason #2938
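For example (a sketch of the naming scheme the warning suggests; the worker names themselves are arbitrary), two workers on the same host could be started as:
celery -A mysite worker -l info --pool=solo -n worker1@%h
celery -A mysite worker -l info --pool=solo -n worker2@%h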
I'm trying to add and remove tasks dynamically from celery beat through a few flask endpoints.
I created a simple project named myApp and a package called flaskr (yes, like the tutorial) with three files in it
myApp
    flaskr
        __init__.py
        routes.py
        tasks.py
    wsgi.py
This is the endpoint code
@route_blueprint.route('/myApp/add_task')
def add():
    print(celery.conf.beat_schedule)
    print(hex(id(celery)))
    celery.add_periodic_task(10.0, tasks.add.s(55, 2), name='add every 10')
    print(celery.conf.beat_schedule)
    return ""
I go to the PyCharm console tabs, and from one of them I run gunicorn like this:
gunicorn wsgi:app -b localhost:8000
From another console tab I also run Celery like this
celery -A flaskr.celery worker --loglevel=info
And from another I run beat like this
celery -A flaskr.celery beat -l=debug
When I hit the endpoint, I can see in the console that the task is added, but beat never sends it.
I suspected that Flask was adding the task to a different Celery app instance, so I printed the celery object I was modifying, and yes, it was a different one.
This is from celery start
flaskr:0x110048978
-------------- celery@MacBook-Pro.local v4.3.0 (rhubarb)
---- **** -----
--- * *** * -- Darwin-18.6.0-x86_64-i386-64bit 2019-08-26 17:19:47
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: flaskr:0x110048978
- ** ---------- .> transport: redis://localhost:6379/2
- ** ---------- .> results: redis://localhost:6379/2
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
And this is from the endpoint
0x101e31e80
Question
I'm quite new to Python, but I guess this makes sense: I'm triggering the same code from two different processes, one from the Celery worker and the other from Flask/gunicorn, so they'll never see each other.
Is there a way to give Flask access to the instance initialized by the celery command line, or should I start the workers from inside Flask? (I didn't see that in any documentation for Celery or Flask.)
This is the full code
__init__.py
from flask import Flask
from celery import Celery
import config
celery = Celery(__name__,
                backend=config.CELERY_BACKEND,
                broker=config.CELERY_BROKER,
                include=['flaskr.tasks'])

@celery.task
def asd(x, y):
    print('ADD')
    # raise exceptions.Retry(20)
    return x + y

def create_app(test_config=None):
    # create and configure the app
    app = Flask(__name__)

    from .routes import route_blueprint
    app.register_blueprint(route_blueprint)
    return app
tasks.py
from __future__ import absolute_import, unicode_literals
from . import celery
import logging.config
logging.config.fileConfig('logging.conf')
logger = logging.getLogger('myApp')
@celery.task
def add(x, y):
    print('ADD')
    # raise exceptions.Retry(20)
    return x + y

@celery.task(bind=True)
def see_you(self, x, y):
    logger.info('Log de see_you')
    print(x)
    # print("See you in ten seconds!")

print('Initializing from tasks')
print(hex(id(celery)))
print('beat schedule: ' + str(celery.conf.beat_schedule))
# celery.add_periodic_task(10.0, add.s(1, 2), name='add every 10')
# print(str(celery.conf.beat_schedule))
routes.py
from flask import Blueprint
import logging.config
from . import tasks
from . import celery
route_blueprint = Blueprint('route_blueprint', __name__,)
logging.config.fileConfig('logging.conf')
logger = logging.getLogger('myApp')
@route_blueprint.route('/myApp/health')
def health():
    return "Health ok"

@route_blueprint.route('/myApp/add_task')
def add():
    print(celery.conf.beat_schedule)
    # tasks.add.delay(55, 2)
    print(hex(id(celery)))
    celery.add_periodic_task(10.0, tasks.add.s(55, 2), name='add every 10')
    print(celery.conf.beat_schedule)
    return "okkk"
I'm new to Celery. I'm trying to properly configure Celery with my Django project. To test whether Celery works, I've created a periodic task which should print "periodic_task" every 2 seconds. Unfortunately it doesn't work, but there is no error.
1 Installed rabbitmq
2 Project/project/celery.py
from __future__ import absolute_import
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')
from django.conf import settings # noqa
app = Celery('project')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
@app.task(bind=True)
def myfunc():
    print 'periodic_task'

@app.task(bind=True)
def debudeg_task(self):
    print('Request: {0!r}'.format(self.request))
3 Project/project/__init__.py
from __future__ import absolute_import
from .celery import app as celery_app
4 Settings.py
INSTALLED_APPS = [
    'djcelery',
    ...]
...
...
CELERYBEAT_SCHEDULE = {
    'schedule-name': {
        'task': 'project.celery.myfunc',  # We are going to create a email_sending_method later in this post.
        'schedule': timedelta(seconds=2),
    },
}
And before running python manage.py, I run celery -A project worker -l info.
I still can't see any "periodic_task" printed in the console every 2 seconds... Do you know what to do?
EDIT CELERY CONSOLE:
-------------- celery@Milwou_NB v3.1.23 (Cipater)
---- **** -----
--- * *** * -- Windows-8-6.2.9200
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: dolava:0x33d1350
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[tasks]
. project.celery.debudeg_task
. project.celery.myfunc
EDIT:
After changing worker to beat, it seems to work. Something happens every 2 seconds (changed to 5 seconds), but I can't see the results of the task. (I can put anything into CELERYBEAT_SCHEDULE, even a wrong path, and it doesn't raise any error.)
I changed myfunc code to:
@app.task(bind=True)
def myfunc():
    # notifications.send_message_to_admin('sdaa','dsadasdsa')
    with open('text.txt','a') as f:
        f.write('sa')
But I can't see text.txt anywhere.
> celery -A dolava beat -l info
celery beat v3.1.23 (Cipater) is starting.
__ - ... __ - _
Configuration ->
. broker -> amqp://guest:**@localhost:5672//
. loader -> celery.loaders.app.AppLoader
. scheduler -> djcelery.schedulers.DatabaseScheduler
. logfile -> [stderr]@%INFO
. maxinterval -> now (0s)
[2016-10-26 17:46:50,135: INFO/MainProcess] beat: Starting...
[2016-10-26 17:46:50,138: INFO/MainProcess] Writing entries...
[2016-10-26 17:46:51,433: INFO/MainProcess] DatabaseScheduler: Schedule changed.
[2016-10-26 17:46:51,433: INFO/MainProcess] Writing entries...
[2016-10-26 17:46:51,812: INFO/MainProcess] Scheduler: Sending due task schedule-name (dolava_app.tasks.myfunc)
[2016-10-26 17:46:51,864: INFO/MainProcess] Writing entries...
[2016-10-26 17:46:57,138: INFO/MainProcess] Scheduler: Sending due task schedule-name (dolava_app.tasks.myfunc)
[2016-10-26 17:47:02,230: INFO/MainProcess] Scheduler: Sending due task schedule-name (dolava_app.tasks.myfunc)
Try to run
$ celery -A project beat -l info
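(Not part of the original answer.) Note that beat only sends the scheduled tasks; a worker still has to be running to execute them, so either keep both processes up:
celery -A project worker -l info
celery -A project beat -l info
or embed beat in the worker with the -B option:
celery -A project worker -B -l info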
I have a flask app with celery.
When I am running the worker as follows:
celery -A app.celery worker
I get the following output
-------------- celery@local-pc v3.1.22 (Cipater)
---- **** -----
--- * *** * -- Windows-7-6.1.7601-SP1
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: app:0x483e668
- ** ---------- .> transport: mongodb://localhost:27017/app
- ** ---------- .> results: mongodb://localhost:27017/app
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[2016-03-08 15:52:05,587: WARNING/MainProcess] celery@local-pc ready.
[2016-03-08 15:52:08,855: ERROR/MainProcess] Process 'Worker-8' pid:9720 exited with 'exitcode 1'
[2016-03-08 15:52:08,855: ERROR/MainProcess] Process 'Worker-7' pid:11940 exited with 'exitcode 1'
[2016-03-08 15:52:08,856: ERROR/MainProcess] Process 'Worker-6' pid:13120 exited with 'exitcode 1'
...
This goes on endlessly and the CPU rises to 100%.
The relevant configuration is:
CELERY_BROKER_URL = 'mongodb://localhost:27017/app'
CELERY_RESULT_BACKEND = 'mongodb://localhost:27017/'
CELERY_MONGODB_BACKEND_SETTINGS = {
    'database': 'app',
    'taskmeta_collection': 'my_taskmeta_collection',
}
CELERY_IMPORTS = ('app.tasks', )
CELERYD_FORCE_EXEC = True
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
my project structure is:
proj/
    config.py
    app/
        __init__.py
        tasks.py
        views.py
This is how I configured Celery in __init__.py:
app = Flask(__name__)
app.config.from_object('config')
db = SQLAlchemy(app)

def make_celery(app):
    celery = Celery(app.import_name, broker=app.config['CELERY_BROKER_URL'])
    celery.conf.update(app.config)
    TaskBase = celery.Task

    class ContextTask(TaskBase):
        abstract = True

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return TaskBase.__call__(self, *args, **kwargs)

    celery.Task = ContextTask
    return celery

celery = make_celery(app)
this is what I have in tasks.py
from app import celery
@celery.task()
def add_together(a, b):
    return a + b
update
When I remove the following line from the config file, the worker does not exit
CELERY_IMPORTS = ('app.tasks', )
but I get the following error
Traceback (most recent call last):
  File "d:\python34\lib\site-packages\celery\worker\consumer.py", line 456, in on_task_received
    strategies[name](message, body,
KeyError: 'app.tasks.add_together'
The official Celery documentation (http://docs.celeryproject.org/en/3.1/getting-started/brokers/mongodb.html)
states that when you use MongoDB as a broker, you are doing so at your own risk:
Using MongoDB
Experimental Status
The MongoDB transport is in need of improvements in many areas and there are several open bugs. Unfortunately we don’t have the resources or funds required to improve the situation, so we’re looking for contributors and partners willing to help.
To test whether that is what is causing your problem, try Redis or RabbitMQ as the broker.
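For example, switching the broker to a local Redis instance is a small change in the config (a sketch, assuming Redis runs on the default port; the result backend can be swapped the same way):
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'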
My periodic task is never executed. What am I missing? I have the RabbitMQ service running. I also have flower running and the celery worker is showing up there.
I find it frustrating that there are a ton of examples of how to use Celery with Django, but most of them target old versions and, I believe, don't apply to the latest release.
celery.py
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'solar_secured.settings')
app = Celery('solar_secured', broker='amqp://guest@localhost//')
# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
tasks.py
from asset_monitor.process_raw import parse_rawdata
from datetime import timedelta
from celery.task import periodic_task
@periodic_task(run_every=timedelta(minutes=5))
def parse_raw():
    parse_rawdata()
started celery worker
C:\dev\solar_secured>manage.py celeryd
C:\Python27\lib\site-packages\celery-3.1.6-py2.7.egg\celery\apps\worker.py:159:
CDeprecationWarning:
Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers
the ability to execute any command. It's important to secure
your broker from unauthorized access when using pickle, so we think
that enabling pickle should require a deliberate action and not be
the default choice.
If you depend on pickle then you should set a setting to disable this
warning and to be sure that everything will continue working
when you upgrade to Celery 3.2::
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
You must only enable the serializers that you will actually use.
warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED))
[2013-12-16 11:26:19,302: WARNING/MainProcess] C:\Python27\lib\site-packages\celery-3.1.6-py2.7.egg\celery\apps\worker.py:159: CDeprecationWarning:
Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers
the ability to execute any command. It's important to secure
your broker from unauthorized access when using pickle, so we think
that enabling pickle should require a deliberate action and not be
the default choice.
If you depend on pickle then you should set a setting to disable this
warning and to be sure that everything will continue working
when you upgrade to Celery 3.2::
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
You must only enable the serializers that you will actually use.
warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED))
-------------- celery@DjangoDev v3.1.6 (Cipater)
---- **** -----
--- * *** * -- Windows-2008ServerR2-6.1.7601-SP1
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> broker: amqp://guest@localhost:5672//
- ** ---------- .> app: solar_secured:0x2cd4b00
- ** ---------- .> concurrency: 8 (prefork)
- *** --- * --- .> events: OFF (enable -E to monitor this worker)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
C:\Python27\lib\site-packages\celery-3.1.6-py2.7.egg\celery\fixups\django.py:224: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2013-12-16 11:26:20,512: WARNING/MainProcess] C:\Python27\lib\site-packages\celery-3.1.6-py2.7.egg\celery\fixups\django.py:224: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2013-12-16 11:26:20,512: WARNING/MainProcess] celery@DjangoDev ready.
started celery beat
C:\dev\solar_secured>manage.py celery beat
celery beat v3.1.6 (Cipater) is starting.
__ - ... __ - _
Configuration ->
. broker -> amqp://guest@localhost:5672//
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%INFO
. maxinterval -> now (0s)
[2013-12-16 14:18:48,206: INFO/MainProcess] beat: Starting...
You have to run a scheduler to execute periodic tasks. You can use Celery beat, which is the default scheduler. To start the worker with beat embedded, run python manage.py celery worker -B
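As an alternative to the periodic_task decorator (a sketch, not from the original answer; the dotted task name below assumes tasks.py is importable as asset_monitor.tasks, so adjust it to the name the worker actually prints under [tasks]), the same schedule can be declared in settings.py, which the app already loads via config_from_object('django.conf:settings'):

from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    'parse-raw-every-5-minutes': {
        # hypothetical dotted path -- use the registered task name from the worker output
        'task': 'asset_monitor.tasks.parse_raw',
        'schedule': timedelta(minutes=5),
    },
}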