Celery - worker only sometimes picks up tasks - python

I am building a lead generation portal that can be accessed online. Please don't mind the verbosity of the code; I'm doing a lot of debugging right now.
My Celery worker inconsistently picks up tasks assigned to it, and I'm not sure why.
The weird thing is that sometimes it works perfectly; there are never any explicit errors in the terminal.
I am currently running with DEBUG = True and Redis as the broker.
The celery worker start command and its output:
celery -A mysite worker -l info --pool=solo
-------------- celery@DESKTOP-OG8ENRQ v5.0.2 (singularity)
--- ***** -----
-- ******* ---- Windows-10-10.0.19041-SP0 2020-11-09 00:36:13
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: mysite:0x41ba490
- ** ---------- .> transport: redis://localhost:6379//
- ** ---------- .> results: redis://localhost:6379/
- *** --- * --- .> concurrency: 12 (solo)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. mysite.celery.debug_task
. submit
[2020-11-09 00:36:13,899: INFO/MainProcess] Connected to redis://localhost:6379//
[2020-11-09 00:36:14,939: WARNING/MainProcess] c:\users\coole\pycharmprojects\lead_django_retry\venv\lib\site-packages\celery\app\control.py:48: DuplicateNodenameWarning: Received multiple replies from node name: celery@DESKTOP-OG8ENRQ.
Please make sure you give each node a unique nodename using
the celery worker `-n` option.
warnings.warn(DuplicateNodenameWarning(
[2020-11-09 00:36:14,939: INFO/MainProcess] mingle: all alone
[2020-11-09 00:36:14,947: INFO/MainProcess] celery@DESKTOP-OG8ENRQ ready.
views.py
class LeadInputView(FormView):
    template_name = 'lead_main.html'
    form_class = LeadInput

    def form_valid(self, form):
        print("I'm at views")
        form.submit()
        print(form.submit)
        return HttpResponseRedirect('./success/')
tasks.py
@task(name="submit")
def start_task(city, category, email):
    """Sends an email when the feedback form is filled in successfully."""
    print("I'm at tasks!")
    print(city, category, email)
    logger.info("Submitted")
    return start(city, category, email)
forms.py
class LeadInput(forms.Form):
    city = forms.CharField(max_length=50)
    category = forms.CharField(max_length=50)
    email = forms.EmailField()

    def submit(self):
        print("I'm at forms!")
        x = start_task.delay(
            self.cleaned_data['city'],
            self.cleaned_data['category'],
            self.cleaned_data['email'],
        )
        return x
celery.py
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

app = Celery('mysite')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
settings.py
BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
The runserver terminal will look something like this:
I'm at views
I'm at forms!
<bound method LeadInput.submit of <LeadInput bound=True, valid=True, fields=(city;category;email)>>
But the worker doesn't say that it picked up anything, just "celery@DESKTOP-OG8ENRQ ready." Except when it does work... for some reason? I'm at a loss!

Hello to whoever sees this. It turns out that this is a bug with celery (or maybe redis?); apparently many windows users run into this: https://github.com/celery/celery/issues/3759
It turns out the answer is to pass -P solo when starting the worker. I'm not sure why this is the case, but that solved it!
Thank you Naqib for your help! You put me down the right rabbit hole to a solution.

By default, celery uses the hostname as the worker name. If you want to run multiple workers on the same host, specify the -n option:
celery -A mysite worker -l info --pool=solo -n worker2@%h
Your code works fine, but the task is passed to the first worker; see
DuplicateNodenameWarning with no obvious reason #2938
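For example, two uniquely named workers on the same host could be started like this (the names worker1 and worker2 are arbitrary placeholders):
celery -A mysite worker -l info --pool=solo -n worker1@%h
celery -A mysite worker -l info --pool=solo -n worker2@%h
Celery expands %h to the hostname, so the node names come out as worker1@DESKTOP-OG8ENRQ and worker2@DESKTOP-OG8ENRQ, and the DuplicateNodenameWarning goes away.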

Related

Celery and flask, same celery app instance

I'm trying to add and remove tasks dynamically from celery beat through a few flask endpoints.
I created a simple project named myApp and a package called flaskr (yes, like the tutorial) with three files in it:
myApp/
    flaskr/
        __init__.py
        routes.py
        tasks.py
    wsgi.py
This is the endpoint code:
@route_blueprint.route('/myApp/add_task')
def add():
    print(celery.conf.beat_schedule)
    print(hex(id(celery)))
    celery.add_periodic_task(10.0, tasks.add.s(55, 2), name='add every 10')
    print(celery.conf.beat_schedule)
    return ""
I go to the PyCharm console and from one tab I run gunicorn like this:
gunicorn wsgi:app -b localhost:8000
From another console tab I run Celery like this:
celery -A flaskr.celery worker --loglevel=info
And from another I run beat like this:
celery -A flaskr.celery beat -l debug
When I hit the endpoint, I can see the task being added in the console, but beat never sends it.
I suspected that flask was adding the task to a different celery_app instance, so I printed the celery object I was trying to modify, and yes, it was a different one.
This is from the celery startup:
flaskr:0x110048978
-------------- celery@MacBook-Pro.local v4.3.0 (rhubarb)
---- **** -----
--- * *** * -- Darwin-18.6.0-x86_64-i386-64bit 2019-08-26 17:19:47
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: flaskr:0x110048978
- ** ---------- .> transport: redis://localhost:6379/2
- ** ---------- .> results: redis://localhost:6379/2
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
And this is from the endpoint
0x101e31e80
Question
I'm quite new to python, but I guess it makes sense, because I'm triggering the same code from two different processes, one from the celery worker and the other from flask/gunicorn, so they'll never see each other.
Is there a way to give flask access to the instance initialized by the celery command line, or should I start the workers from inside flask? (I didn't see that in any documentation from celery or flask.)
This is the full code
__init__.py
from flask import Flask
from celery import Celery
import config

celery = Celery(__name__,
                backend=config.CELERY_BACKEND,
                broker=config.CELERY_BROKER,
                include=['flaskr.tasks'])

@celery.task
def asd(x, y):
    print('ADD')
    # raise exceptions.Retry(20)
    return x + y

def create_app(test_config=None):
    # create and configure the app
    app = Flask(__name__)
    from .routes import route_blueprint
    app.register_blueprint(route_blueprint)
    return app
tasks.py
from __future__ import absolute_import, unicode_literals
from . import celery
import logging.config

logging.config.fileConfig('logging.conf')
logger = logging.getLogger('myApp')

@celery.task
def add(x, y):
    print('ADD')
    # raise exceptions.Retry(20)
    return x + y

@celery.task(bind=True)
def see_you(self, x, y):
    logger.info('see_you log')
    print(x)
    # print("See you in ten seconds!")

print('Initializing from tasks')
print(hex(id(celery)))
print('beat schedule: ' + str(celery.conf.beat_schedule))
# celery.add_periodic_task(10.0, add.s(1, 2), name='add every 10')
# print(str(celery.conf.beat_schedule))
routes.py
from flask import Blueprint
import logging.config
from . import tasks
from . import celery

route_blueprint = Blueprint('route_blueprint', __name__)

logging.config.fileConfig('logging.conf')
logger = logging.getLogger('myApp')

@route_blueprint.route('/myApp/health')
def health():
    return "Health ok"

@route_blueprint.route('/myApp/add_task')
def add():
    print(celery.conf.beat_schedule)
    # tasks.add.delay(55, 2)
    print(hex(id(celery)))
    celery.add_periodic_task(10.0, tasks.add.s(55, 2), name='add every 10')
    print(celery.conf.beat_schedule)
    return "okkk"

Celery not queuing to a remote broker, adding tasks to a localhost instead

My question is the same as this one: Celery not queuing tasks to broker on remote server, adds tasks to localhost instead, but the answer is not working for me.
My celery.py
import os
from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')

app = Celery('project',
             broker='amqp://<user>:<user_pass>@remoteserver:5672/<vhost>',
             backend='amqp')
# app = Celery('project')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
When I run:
$ celery -A worker -l info
I receive the following output:
-------------- celery@paulo-Inspiron-3420 v4.2.1 (windowlicker)
---- **** -----
--- * *** * -- Linux-4.15.0-36-generic-x86_64-with-Ubuntu-18.04-bionic 2018-10-30 13:44:07
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: mycelery:0x7ff88ca043c8
- ** ---------- .> transport: amqp://<user>:**@<remote_ip>:5672/<vhost>
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
I tried stopping the rabbitmq server and uninstalled it too, but celery keeps queuing to localhost.
Can someone help?
You need to add something like this to your __init__.py file in the same directory as the celery.py file:
from __future__ import absolute_import, unicode_literals
from .celery import app as celery_app
__all__ = ('celery_app',)
Also, make sure you're starting the worker process from inside your project's virtualenv.
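For reference, assuming the usual Django project layout, the file to edit is the package __init__.py that sits next to celery.py:
project/
    __init__.py   # <- add the celery_app import here
    celery.py
    settings.py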

Django + Celery + SQS setup. Celery connects to default RabbitMQ via ampq

I am trying to set up Amazon SQS as the default message broker for Celery in a Django app. The Celery worker starts, but the broker is set to the default RabbitMQ. Below you can find the output of my worker.
Here are some configs I have in the project. My celery.py looks like:
from __future__ import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'dance.settings')

app = Celery('dance')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
The essential part of the Django celery settings responsible for setting up the broker url is:
BROKER_URL = 'sqs://{}:{}@'.format(AWS_ACCESS_KEY_ID, quote(AWS_SECRET_ACCESS_KEY, safe=''))
BROKER_TRANSPORT_OPTIONS = {
    'region': 'eu-west-1',
    'polling_interval': 3,
    'visibility_timeout': 300,
    'queue_name_prefix': 'dev-celery-',
}
When I try to launch the worker within the virtual environment with:
celery -A dance worker -l info
I receive following output:
-------------- celery@universe v4.0.0 (latentcall)
---- **** -----
--- * *** * -- Linux-4.8.0-28-generic-x86_64-with-debian-stretch-sid 2016-12-02 14:20:40
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: dance:0x7fdc592ca9e8
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results:
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
...
task1
task2
...
Tasks are listed, so I guess Celery reads and processes the related Django settings. If I switch the settings from SQS to Redis, I get the same problem.
As I understand from the tutorials I have read, the worker's output should look similar to:
- ** ---------- .> transport: sqs://*redacted*:**@localhost//
- ** ---------- .> results: djcelery.backends.database:DatabaseBackend
Also, I am not using djcelery, as it is outdated. Instead I am using django_celery_results, as recommended on the Celery setup pages. The last output is just a guess from a side project.
The only possible solution I have found is to explicitly specify the broker and database backend:
app = Celery('dance', broker='sqs://', backend='django-db')
This looks strange to me, because the settings from Django's settings.py are not fully loaded, or probably I have missed something; otherwise it is a bug in Celery.
Real solution:
Here is why I had problems:
All the Celery settings in Django should start with CELERY_, so instead of using BROKER_URL and BROKER_TRANSPORT_OPTIONS I had to use CELERY_BROKER_URL and CELERY_BROKER_TRANSPORT_OPTIONS.
Incorrect: you need to use CELERY_BROKER_URL because you use the CELERY namespace, but some options come with a CELERY prefix by default, for example CELERY_RESULT_BACKEND; if you use the CELERY namespace, you would need to write CELERY_CELERY_RESULT_BACKEND.
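For concreteness, a sketch of the renamed broker settings from the question (values unchanged; quote is assumed to come from urllib.parse):
from urllib.parse import quote

CELERY_BROKER_URL = 'sqs://{}:{}@'.format(AWS_ACCESS_KEY_ID, quote(AWS_SECRET_ACCESS_KEY, safe=''))
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'region': 'eu-west-1',
    'polling_interval': 3,
    'visibility_timeout': 300,
    'queue_name_prefix': 'dev-celery-',
}
With namespace='CELERY', Celery strips the CELERY_ prefix before matching the setting name, so these are picked up as broker_url and broker_transport_options.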

Simple periodic task in celery doesn't work but no errors

I'm new to Celery. I'm trying to configure Celery properly with my Django project. To test whether celery works, I've created a periodic task which should print "periodic_task" every 2 seconds. Unfortunately it doesn't work, but there is no error.
1. Installed rabbitmq
2. Project/project/celery.py
from __future__ import absolute_import
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')
from django.conf import settings  # noqa

app = Celery('project')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

@app.task(bind=True)
def myfunc():
    print 'periodic_task'

@app.task(bind=True)
def debudeg_task(self):
    print('Request: {0!r}'.format(self.request))
3. Project/project/__init__.py
from __future__ import absolute_import
from .celery import app as celery_app
4. Settings.py
INSTALLED_APPS = [
    'djcelery',
    ...
]
...
...
CELERYBEAT_SCHEDULE = {
    'schedule-name': {
        'task': 'project.celery.myfunc',  # We are going to create a email_sending_method later in this post.
        'schedule': timedelta(seconds=2),
    },
}
And before python manage.py, I run celery -A project worker -l info.
Still I can't see any "periodic_task" printed in the console every 2 seconds... Do you know what to do?
EDIT CELERY CONSOLE:
-------------- celery@Milwou_NB v3.1.23 (Cipater)
---- **** -----
--- * *** * -- Windows-8-6.2.9200
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: dolava:0x33d1350
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[tasks]
. project.celery.debudeg_task
. project.celery.myfunc
EDIT:
After changing worker to beat, it seems to work. Something happens every 2 seconds (I changed it to 5 seconds), but I can't see the results of the task. (I can put anything into CELERYBEAT_SCHEDULE, even a wrong path, and it doesn't raise any error.)
I changed the myfunc code to:
@app.task(bind=True)
def myfunc():
    # notifications.send_message_to_admin('sdaa','dsadasdsa')
    with open('text.txt', 'a') as f:
        f.write('sa')
But I can't see text.txt anywhere.
> celery -A dolava beat -l info
celery beat v3.1.23 (Cipater) is starting.
__ - ... __ - _
Configuration ->
. broker -> amqp://guest:**@localhost:5672//
. loader -> celery.loaders.app.AppLoader
. scheduler -> djcelery.schedulers.DatabaseScheduler
. logfile -> [stderr]@%INFO
. maxinterval -> now (0s)
[2016-10-26 17:46:50,135: INFO/MainProcess] beat: Starting...
[2016-10-26 17:46:50,138: INFO/MainProcess] Writing entries...
[2016-10-26 17:46:51,433: INFO/MainProcess] DatabaseScheduler: Schedule changed.
[2016-10-26 17:46:51,433: INFO/MainProcess] Writing entries...
[2016-10-26 17:46:51,812: INFO/MainProcess] Scheduler: Sending due task schedule-name (dolava_app.tasks.myfunc)
[2016-10-26 17:46:51,864: INFO/MainProcess] Writing entries...
[2016-10-26 17:46:57,138: INFO/MainProcess] Scheduler: Sending due task schedule-name (dolava_app.tasks.myfunc)
[2016-10-26 17:47:02,230: INFO/MainProcess] Scheduler: Sending due task schedule-name (dolava_app.tasks.myfunc)
Try to run
$ celery -A project beat -l info
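A worker only consumes tasks; beat is the separate scheduler process that actually sends the periodic tasks to the broker on schedule, so both must be running. For development only, you can embed beat inside the worker with the -B flag:
$ celery -A project worker -B -l info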

Flask with Celery - workers exit with exitcode1

I have a flask app with celery.
When I run the worker as follows:
celery -A app.celery worker
I get the following output:
-------------- celery@local-pc v3.1.22 (Cipater)
---- **** -----
--- * *** * -- Windows-7-6.1.7601-SP1
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: app:0x483e668
- ** ---------- .> transport: mongodb://localhost:27017/app
- ** ---------- .> results: mongodb://localhost:27017/app
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[2016-03-08 15:52:05,587: WARNING/MainProcess] celery@local-pc ready.
[2016-03-08 15:52:08,855: ERROR/MainProcess] Process 'Worker-8' pid:9720 exited with 'exitcode 1'
[2016-03-08 15:52:08,855: ERROR/MainProcess] Process 'Worker-7' pid:11940 exited with 'exitcode 1'
[2016-03-08 15:52:08,856: ERROR/MainProcess] Process 'Worker-6' pid:13120 exited with 'exitcode 1'
...
It goes on endlessly and the CPU rises to 100%.
The relevant configuration is:
CELERY_BROKER_URL = 'mongodb://localhost:27017/app'
CELERY_RESULT_BACKEND = 'mongodb://localhost:27017/'
CELERY_MONGODB_BACKEND_SETTINGS = {
    'database': 'app',
    'taskmeta_collection': 'my_taskmeta_collection',
}
CELERY_IMPORTS = ('app.tasks', )
CELERYD_FORCE_EXEC = True
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
My project structure is:
proj/
    config.py
    app/
        __init__.py
        tasks.py
        views.py
This is how I configured celery in __init__.py:
from flask import Flask
from celery import Celery
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config.from_object('config')
db = SQLAlchemy(app)

def make_celery(app):
    celery = Celery(app.import_name, broker=app.config['CELERY_BROKER_URL'])
    celery.conf.update(app.config)
    TaskBase = celery.Task

    class ContextTask(TaskBase):
        abstract = True

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return TaskBase.__call__(self, *args, **kwargs)

    celery.Task = ContextTask
    return celery

celery = make_celery(app)
This is what I have in tasks.py:
from app import celery

@celery.task()
def add_together(a, b):
    return a + b
Update
When I remove the following line from the config file, the worker does not exit:
CELERY_IMPORTS = ('app.tasks', )
but then I get the following error:
Traceback (most recent call last):
  File "d:\python34\lib\site-packages\celery\worker\consumer.py", line 456, in on_task_received
    strategies[name](message, body,
KeyError: 'app.tasks.add_together'
The official Celery documentation (http://docs.celeryproject.org/en/3.1/getting-started/brokers/mongodb.html) states that when using MongoDB as a broker, you are doing it at your own risk:
Using MongoDB
Experimental Status
The MongoDB transport is in need of improvements in many areas and there are several open bugs. Unfortunately we don't have the resources or funds required to improve the situation, so we're looking for contributors and partners willing to help.
To test whether that is what generates your problem, try using redis or rabbitMQ as the broker.
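As a quick test, a sketch of this question's config switched over to Redis (the URL and database index are placeholders; a local redis server must be running):
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'
If the workers stop exiting with 'exitcode 1' after the switch, the experimental MongoDB transport was the likely culprit.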
