Celery and flask, same celery app instance - python

I'm trying to add and remove tasks dynamically from celery beat through a few flask endpoints.
I created a simple project named myApp and a package called flaskr (yes, like the tutorial) with three files in it
myApp
flaskr
__init__.py
routes.py
tasks.py
wsgi.py
This is the endpoint code
@route_blueprint.route('/myApp/add_task')
def add():
    print(celery.conf.beat_schedule)
    print(hex(id(celery)))
    celery.add_periodic_task(10.0, tasks.add.s(55, 2), name='add every 10')
    print(celery.conf.beat_schedule)
    return ""
I open several PyCharm console tabs, and from one of them I run gunicorn like this:
gunicorn wsgi:app -b localhost:8000
From another console tab I also run Celery like this
celery -A flaskr.celery worker --loglevel=info
And from another I run beat like this
celery -A flaskr.celery beat -l=debug
When I hit the endpoint, I can see in the console that the task is added, but beat never sends it.
I suspected that Flask was adding the task to a different Celery app instance, so I printed the id of the celery object I was trying to modify and, yes, it was a different one.
This is from celery start
flaskr:0x110048978
-------------- celery@MacBook-Pro.local v4.3.0 (rhubarb)
---- **** -----
--- * *** * -- Darwin-18.6.0-x86_64-i386-64bit 2019-08-26 17:19:47
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: flaskr:0x110048978
- ** ---------- .> transport: redis://localhost:6379/2
- ** ---------- .> results: redis://localhost:6379/2
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
And this is from the endpoint
0x101e31e80
Question
I'm quite new to Python, but I guess it makes sense, because I'm triggering the same code from two different processes, one from the Celery worker and the other from Flask/gunicorn, so they'll never see each other.
Is there a way to give Flask access to the instance initialized from the celery command line, or should I start the workers from inside Flask? (I didn't see that in any documentation from Celery or Flask.)
This is the full code
__init__.py
from flask import Flask
from celery import Celery
import config
celery = Celery(__name__,
                backend=config.CELERY_BACKEND,
                broker=config.CELERY_BROKER,
                include=['flaskr.tasks'])
@celery.task
def asd(x, y):
    print('ADD')
    # raise exceptions.Retry(20)
    return x + y
def create_app(test_config=None):
    # create and configure the app
    app = Flask(__name__)
    from .routes import route_blueprint
    app.register_blueprint(route_blueprint)
    return app
tasks.py
from __future__ import absolute_import, unicode_literals
from . import celery
import logging.config
logging.config.fileConfig('logging.conf')
logger = logging.getLogger('myApp')
@celery.task
def add(x, y):
    print('ADD')
    # raise exceptions.Retry(20)
    return x + y
@celery.task(bind=True)
def see_you(self, x, y):
    logger.info('Log de see_you')
    print(x)
    # print("See you in ten seconds!")
print('Initializing from tasks')
print(hex(id(celery)))
print('beat schedule: ' + str(celery.conf.beat_schedule))
# celery.add_periodic_task(10.0, add.s(1, 2), name='add every 10')
# print(str(celery.conf.beat_schedule))
routes.py
from flask import Blueprint
import logging.config
from . import tasks
from . import celery
route_blueprint = Blueprint('route_blueprint', __name__,)
logging.config.fileConfig('logging.conf')
logger = logging.getLogger('myApp')
@route_blueprint.route('/myApp/health')
def health():
    return "Health ok"
@route_blueprint.route('/myApp/add_task')
def add():
    print(celery.conf.beat_schedule)
    # tasks.add.delay(55, 2)
    print(hex(id(celery)))
    celery.add_periodic_task(10.0, tasks.add.s(55, 2), name='add every 10')
    print(celery.conf.beat_schedule)
    return "okkk"
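The suspicion in the question, that the gunicorn process and the Celery processes each hold their own copy of the app object, can be illustrated without Celery at all. The sketch below is only a stand-in: `schedule` is a hypothetical plain dict playing the role of a module-level schedule, and a second interpreter started with `subprocess` plays the role of the other process. A change made in one process is not visible in the other.

```python
import subprocess
import sys

# Hypothetical stand-in for a module-level object such as an in-memory schedule.
schedule = {}

# Code executed in a separate interpreter: it builds its own `schedule`
# and adds an entry to it, then prints the entry names.
child_code = (
    "schedule = {}\n"
    "schedule['add every 10'] = 10.0\n"
    "print(sorted(schedule))\n"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)

child_view = result.stdout.strip()   # what the other process ended up with
parent_view = sorted(schedule)       # what this process still has

print("child process sees: ", child_view)
print("parent process sees:", parent_view)
```

The parent's dict stays empty even though the child added an entry, which matches the behaviour described above: an entry added inside the web process lives only in that process's memory.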

Related

Celery - worker only sometimes picks up tasks

I am building a lead generation portal that can be accessed online. Please don't mind the verbosity of the code, I'm doing a lot of debugging right now.
My Celery worker inconsistently picks up tasks assigned to it, and I'm not sure why.
The weird thing about this, is that sometimes it works 100% perfect: there never are any explicit errors in the terminal.
I am currently running with DEBUG = True and Redis as the broker.
Celery worker start command and response
celery -A mysite worker -l info --pool=solo
-------------- celery@DESKTOP-OG8ENRQ v5.0.2 (singularity)
--- ***** -----
-- ******* ---- Windows-10-10.0.19041-SP0 2020-11-09 00:36:13
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: mysite:0x41ba490
- ** ---------- .> transport: redis://localhost:6379//
- ** ---------- .> results: redis://localhost:6379/
- *** --- * --- .> concurrency: 12 (solo)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. mysite.celery.debug_task
. submit
[2020-11-09 00:36:13,899: INFO/MainProcess] Connected to redis://localhost:6379//
[2020-11-09 00:36:14,939: WARNING/MainProcess] c:\users\coole\pycharmprojects\lead_django_retry\venv\lib\site-packages\celery\app\control.py:48: DuplicateNodenameWarning: Received multiple replies from node name: celery@DESKTOP-OG8ENRQ.
Please make sure you give each node a unique nodename using
the celery worker `-n` option.
warnings.warn(DuplicateNodenameWarning(
[2020-11-09 00:36:14,939: INFO/MainProcess] mingle: all alone
[2020-11-09 00:36:14,947: INFO/MainProcess] celery@DESKTOP-OG8ENRQ ready.
views.py
class LeadInputView(FormView):
    template_name = 'lead_main.html'
    form_class = LeadInput
    def form_valid(self, form):
        print("I'm at views")
        form.submit()
        print(form.submit)
        return HttpResponseRedirect('./success/')
tasks.py
@task(name="submit")
def start_task(city, category, email):
    """sends an email when feedback form is filled successfully"""
    print("I'm at tasks!")
    print(city, category, email)
    logger.info("Submitted")
    return start(city, category, email)
forms.py
class LeadInput(forms.Form):
    city = forms.CharField(max_length=50)
    category = forms.CharField(max_length=50)
    email = forms.EmailField()
    def submit(self):
        print("I'm at forms!")
        x = (start_task.delay(self.cleaned_data['city'], self.cleaned_data['category'], self.cleaned_data['email']))
        return x
celery.py
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')
app = Celery('mysite')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
settings.py
BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
The runserver terminal will look something like this:
I'm at views
I'm at forms!
<bound method LeadInput.submit of <LeadInput bound=True, valid=True, fields=(city;category;email)>>
But the worker doesn't say that it picked up anything, just "celery@DESKTOP-OG8ENRQ ready." Except when it does work... for some reason? I'm at a loss!
Hello to whoever sees this. It turns out that this is a bug with Celery (or maybe Redis?); apparently many Windows users run into it. https://github.com/celery/celery/issues/3759
The answer is to pass -P solo when starting the worker. I'm not sure why this is the case, but that solved it!
Thank you Naqib for your help! You sent me down the right rabbit hole to a solution.
By default, Celery uses the hostname as the worker name. If you want to run multiple workers on the same host, give each one a unique name with the -n option:
celery -A mysite worker -l info --pool=solo -n worker2@%h
Your code works fine, but the task is passed to the first worker; see the Celery issue
"DuplicateNodenameWarning with no obvious reason" #2938

Celery not queuing to a remote broker, adding tasks to a localhost instead

My question is the same as "Celery not queuing tasks to broker on remote server, adds tasks to localhost instead", but the answer there does not work for me.
My celery.py
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')
app = Celery('project', broker='amqp://<user>:<user_pass>@remoteserver:5672/<vhost>', backend='amqp')
# app = Celery('project')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
When I run:
$ celery -A worker -l info
I receive the following output:
-------------- celery@paulo-Inspiron-3420 v4.2.1 (windowlicker)
---- **** -----
--- * *** * -- Linux-4.15.0-36-generic-x86_64-with-Ubuntu-18.04-bionic 2018-10-30 13:44:07
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: mycelery:0x7ff88ca043c8
- ** ---------- .> transport: amqp://<user>:**@<remote_ip>:5672/<vhost>
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
I tried stopping the RabbitMQ server and even uninstalled it, but Celery keeps queuing to localhost.
Can someone help?
You need to add something like this to your __init__.py file in the same directory as the celery.py file:
from __future__ import absolute_import, unicode_literals
from .celery import app as celery_app
__all__ = ('celery_app',)
Also, make sure you're starting the worker process from inside your project's virtualenv.

Celery using 'application/x-python-serialize' instead of `application/json`

I'm using the celery module v. 3.1.25 in Python 2.7 and Windows 10 to run a Celery worker. The results must be returned encoded in json and not pickle.
Problem: When the worker returns the result of a task, RabbitMQ management console shows the results to be content_type: application/x-python-serialize. Why is it still x-python-serialize when we have set task_serializer, result_serializer and accept_content to json?
proj/celery.py
from __future__ import absolute_import, unicode_literals
from celery import Celery
app = Celery('tasks',
             broker='amqp://test:test@192.168.1.26:5672//',  # running in Win10 VM
             backend='amqp://',
             task_serializer='json',
             result_serializer='json',
             accept_content=['application/json'],
             include=['proj.tasks'])
proj/tasks.py
from __future__ import absolute_import, unicode_literals
from .celery import app
@app.task
def myTask():
    ...
    return ...
Worker is started using
celery -A proj worker --loglevel=info
and gives a warning about the pickle serializer
Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers
the ability to execute any command. It's important to secure
your broker from unauthorized access when using pickle, so we think
that enabling pickle should require a deliberate action and not be
the default choice.
If you depend on pickle then you should set a setting to disable this
warning and to be sure that everything will continue working
when you upgrade to Celery 3.2::
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
You must only enable the serializers that you will actually use.
warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED))
-------------- celery@Y-PC v3.1.25 (Cipater)
---- **** -----
--- * *** * -- Windows-10-10.0.14393
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: tasks:0x40ffeb8
- ** ---------- .> transport: amqp://test:**@192.168.1.26:5672//
- ** ---------- .> results: amqp://
- *** --- * --- .> concurrency: 12 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
Does it help to change your Celery() config parameter to accept_content=['json'], instead of application/json?

Django + Celery + SQS setup. Celery connects to default RabbitMQ via amqp

I am trying to set up Amazon SQS as the default message broker for Celery in a Django app. The Celery worker starts, but the broker is set to the default RabbitMQ. Below you can find the output of my worker.
Here are some configs which I have in the project. My celery.py looks like:
from __future__ import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'dance.settings')
app = Celery('dance')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
The essential part of the Django celery settings responsible for setup of broker url is:
BROKER_URL = 'sqs://{}:{}@'.format(AWS_ACCESS_KEY_ID, quote(AWS_SECRET_ACCESS_KEY, safe=''))
BROKER_TRANSPORT_OPTIONS = {
'region': 'eu-west-1',
'polling_interval': 3,
'visibility_timeout': 300,
'queue_name_prefix':'dev-celery-',
}
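The `quote(..., safe='')` call in the BROKER_URL line matters because a secret key may contain characters such as `/` or `+` that would otherwise be misread as part of the URL structure. A minimal sketch using only the standard library; the credentials here are made-up placeholders, not real AWS keys:

```python
from urllib.parse import quote

# Hypothetical placeholder credentials, for illustration only.
access_key_id = "AKIAEXAMPLE"
secret_access_key = "abc/def+ghi"

# By default quote() treats '/' as safe and leaves it unescaped.
default_quoted = quote(secret_access_key)

# With safe='' the '/' is percent-encoded as well.
fully_quoted = quote(secret_access_key, safe="")

broker_url = "sqs://{}:{}@".format(access_key_id, fully_quoted)

print(default_quoted)
print(fully_quoted)
print(broker_url)
```

With the default `safe` value the slash survives into the URL, which is why the settings above pass `safe=''` explicitly.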
When I try to launch the worker inside the virtual environment with:
celery -A dance worker -l info
I receive the following output:
-------------- celery@universe v4.0.0 (latentcall)
---- **** -----
--- * *** * -- Linux-4.8.0-28-generic-x86_64-with-debian-stretch-sid 2016-12-02 14:20:40
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: dance:0x7fdc592ca9e8
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results:
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
...
task1
task2
...
Tasks are listed, so I guess Celery reads and processes the related Django settings. If I switch the settings from SQS to Redis, I get the same problem.
As I understand from the tutorials I have read, the worker's output should look similar to this:
- ** ---------- .> transport: sqs://*redacted*:**@localhost//
- ** ---------- .> results: djcelery.backends.database:DatabaseBackend
Also, I am not using djcelery because it is outdated. Instead I am using django_celery_results, as recommended on the Celery setup pages. The output above is just a guess based on a side project.
The only workaround I have found is to explicitly specify the broker and the database backend.
To me this looks strange: either the settings from Django's settings.py are not fully loaded, or I have missed something, or it is a bug in Celery.
app = Celery('dance', broker='sqs://', backend='django-db')
Real solution:
Here is why I had problems:
All the Celery variables in Django should start with CELERY, so instead of using BROKER_URL and BROKER_TRANSPORT_OPTIONS I had to use CELERY_BROKER_URL and CELERY_BROKER_TRANSPORT_OPTIONS.
Incorrect: you need to use CELERY_BROKER_URL when you use the CELERY namespace. But some options already carry a CELERY prefix by default, for example CELERY_RESULT_BACKEND. If you use the CELERY namespace, you need to write CELERY_CELERY_RESULT_BACKEND.
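The effect of `namespace='CELERY'` described above can be pictured with a toy model. This is not Celery's actual implementation; `pick_namespaced` is a hypothetical helper that only illustrates the idea that keys without the prefix are skipped and the prefix is stripped from the ones that match.

```python
def pick_namespaced(settings, namespace):
    """Keep only keys starting with '<namespace>_', strip the prefix, lowercase the rest."""
    prefix = namespace + "_"
    return {
        key[len(prefix):].lower(): value
        for key, value in settings.items()
        if key.startswith(prefix)
    }


# Hypothetical excerpt of a Django settings module, written as a dict.
django_settings = {
    "BROKER_URL": "sqs://",                 # no CELERY_ prefix: skipped
    "CELERY_BROKER_URL": "sqs://",          # picked up as broker_url
    "CELERY_BROKER_TRANSPORT_OPTIONS": {"region": "eu-west-1"},
}

picked = pick_namespaced(django_settings, "CELERY")
print(picked)
```

Under this model the unprefixed BROKER_URL never reaches the app, which is consistent with the worker falling back to the default amqp transport in the output shown earlier.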

Python django rabbitmq celery problems with importing tasks

Welcome... I'm creating a project where I parse xlsx files with the xlrd library. Everything works just fine. Then I configured RabbitMQ and Celery and created some tasks in the main folder, which work and can be accessed from IPython. The problems start when I'm in my application (an application created earlier in my project) and I try to import tasks from my app in my views.py.
I tried to import them with every possible path, but every time it throws an error.
The official documentation shows the right way of importing tasks from other applications. It looks like this:
from project.myapp.tasks import mytask
But it doesn't work at all.
In addition, when I'm in IPython I can import tasks with the command from tango.tasks import add
And it works perfectly.
Just below I'm posting my files and the error printed to the console.
views.py
# these are the imports I tried that seemed the most reasonable, but none of them worked
# import tasks
# from new_tango_project.tango.tasks import add
# from new_tango_project.tango import tasks
# from new_tango_project.new_tango_project.tango.tasks import add
# from new_tango_project.new_tango_project.tango import tasks
# from tango import tasks
# function to parse files
def parse_file(request, file_id):
    xlrd_file = get_object_or_404(xlrdFile, pk=file_id)
    if xlrd_file.status == False:
        # this is some basic task that I want to enter to
        tasks.add.delay(321, 123)
settings.py
#I've just posted things directly connected to celery
import djcelery
INSTALLED_APPS = (
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'tango',
'djcelery',
'celery',
)
BROKER_URL = "amqp://sebrabbit:seb@localhost:5672/myvhost"
BROKER_HOST = "127.0.0.1"
BROKER_PORT = 5672
BROKER_VHOST = "myvhost"
BROKER_USER = "sebrabbit"
BROKER_PASSWORD = "seb"
CELERY_RESULT_BACKEND = 'amqp://'
CELERY_TASK_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT=['json']
CELERY_TIMEZONE = 'Europe/Warsaw'
CELERY_ENABLE_UTC = False
celery.py (in my main folder new_tango_project )
from __future__ import absolute_import
import os
from celery import Celery
import djcelery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'new_tango_project.settings')
app = Celery('new_tango_project')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
# CELERY_IMPORTS = ['tango.tasks']
# Optional configuration, see the application user guide.
app.conf.update(
    CELERY_TASK_RESULT_EXPIRES=3600,
    CELERY_RESULT_BACKEND='djcelery.backends.cache:CacheBackend',
)
if __name__ == '__main__':
    app.start()
tasks.py (in my main project folder new_tango_project)
from __future__ import absolute_import
from celery import Celery
from celery.task import task
app = Celery('new_tango_project',
             broker='amqp://sebrabbit:seb@localhost:5672/myvhost',
             backend='amqp://',
             include=['tasks'])
@task
def add(x, y):
    return x + y
@task
def mul(x, y):
    return x * y
@task
def xsum(numbers):
    return sum(numbers)
@task
def parse(file_id, xlrd_file):
    return "HAHAHAHHHAHHA"
tasks.py in my application folder
from __future__ import absolute_import
from celery import Celery
from celery.task import task
#
app = Celery('tango')
@task
def add(x, y):
    return x + y
@task
def asdasdasd(x, y):
    return x + y
celery console when starting
-------------- celery@debian v3.1.17 (Cipater)
---- **** -----
--- * *** * -- Linux-3.2.0-4-amd64-x86_64-with-debian-7.8
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: new_tango_project:0x1b746d0
- ** ---------- .> transport: amqp://sebrabbit:**@localhost:5672/myvhost
- ** ---------- .> results: amqp://
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
Finally my console log...
[2015-02-20 11:19:45,678: ERROR/MainProcess] Received unregistered task of type 'new_tango_project.tasks.add'.
The message has been ignored and discarded.
Did you remember to import the module containing this task?
Or maybe you are using relative imports?
Please see http://bit.ly/gLye1c for more information.
The full contents of the message body was:
{'utc': True, 'chord': None, 'args': (123123123, 123213213), 'retries': 0, 'expires': None, 'task': 'new_tango_project.tasks.add', 'callbacks': None, 'errbacks': None, 'timelimit': (None, None), 'taskset': None, 'kwargs': {}, 'eta': None, 'id': 'd9a8e560-1cd0-491d-a132-10345a04f391'} (233b)
Traceback (most recent call last):
File "/home/seb/PycharmProjects/tango/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 455, in on_task_received
strategies[name](message, body,
KeyError: 'new_tango_project.tasks.add'
This is the log from one of my many attempts at importing the tasks.
Where am I making a mistake?
Best wishes
Hint 1: In all your tasks.py you declare your Celery app as app = Celery(...) but you don't specify which app the task should be attached to in your task decorators.
Try to change your @task into @app.task and see if it works.
Hint 2: Why do you need to create a new Celery app in every tasks.py? Why don't you just import one main Celery app with from new_tango_project.celery import app and then declare your tasks with @app.task?
Hint 3: Once you have your tasks defined (possibly both in celery.py and tasks.py in the applications), just do
from new_tango_project.celery import add
from my_app.tasks import add_bis
def my_view(request):
    ...
    add.delay(*your_params)  # using the task from your celery.py
    add_bis.delay(*your_params)  # your task from the application
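The "Received unregistered task" error with its KeyError in the log above comes down to a name lookup: the worker keeps a mapping from task names to functions, and the name in the incoming message must be one of its keys. The sketch below is a deliberately simplified toy, not Celery's code; `ToyApp` and `dispatch` are hypothetical names used only to show how a task registered under one name cannot be found under another.

```python
class ToyApp:
    """Hypothetical, much-simplified stand-in for an app with a task registry."""

    def __init__(self, name):
        self.name = name
        self.tasks = {}

    def task(self, func):
        # Register the function under "<app name>.tasks.<function name>".
        self.tasks["{}.tasks.{}".format(self.name, func.__name__)] = func
        return func


worker_app = ToyApp("tango")


@worker_app.task
def add(x, y):
    return x + y


def dispatch(app, task_name, *args):
    """Look the task up by name, the way a worker would for an incoming message."""
    try:
        return app.tasks[task_name](*args)
    except KeyError:
        return "unregistered task: " + task_name


ok = dispatch(worker_app, "tango.tasks.add", 321, 123)
missing = dispatch(worker_app, "new_tango_project.tasks.add", 321, 123)

print(ok)
print(missing)
```

In this toy the second call fails for the same kind of reason as in the log: the message names `new_tango_project.tasks.add`, but nothing was registered under that key. Attaching every task to one shared app, as the hints suggest, keeps the sender's and the worker's names in agreement.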
I wonder how you start your Celery worker. I ran into this once because I didn't start the worker correctly: you should add the -A option when running "celery worker -l info" so that Celery connects to the broker configured in your Celery object. Otherwise Celery will try to connect to the default broker.
