Python django rabbitmq celery problems with importing tasks - python

Welcome... I'm creating a project where I parse xlsx files with xlrd library. Everything works just fine. Then I configured RabbitMQ and Celery. Created some tasks in main folder which works and can be accessed from iPython. The problems starts when I'm in my application (application created back in time in my project) and I try to import tasks from my app in my views.py
I tried to import it with all possible paths but everytime it throws me an error.
Official documentation posts the right way of importing tasks from other applications, It looks like this:
from project.myapp.tasks import mytask
But it doesn't work at all.
In addition when Im in iPython I can import tasks with command from tango.tasks import add
And it works perfectly.
Just bellow I'm uploading my files and error printed out by console.
views.py
# these are the instances that I was trying to import that seemed to be the most reasonable, but non of it worked
# import tasks
# from new_tango_project.tango.tasks import add
# from new_tango_project.tango import tasks
# from new_tango_project.new_tango_project.tango.tasks import add
# from new_tango_project.new_tango_project.tango import tasks
# from tango import tasks
#function to parse files
def parse_file(request, file_id):
xlrd_file = get_object_or_404(xlrdFile, pk = file_id)
if xlrd_file.status == False
#this is some basic task that I want to enter to
tasks.add.delay(321,123)
settings.py
#I've just posted things directly connected to celery
import djcelery
INSTALLED_APPS = (
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'tango',
'djcelery',
'celery',
)
BROKER_URL = "amqp://sebrabbit:seb#localhost:5672/myvhost"
BROKER_HOST = "127.0.0.1"
BROKER_PORT = 5672
BROKER_VHOST = "myvhost"
BROKER_USER = "sebrabbit"
BROKER_PASSWORD = "seb"
CELERY_RESULT_BACKEND = 'amqp://'
CELERY_TASK_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT=['json']
CELERY_TIMEZONE = 'Europe/Warsaw'
CELERY_ENABLE_UTC = False
celery.py (in my main folder new_tango_project )
from __future__ import absolute_import
import os
from celery import Celery
import djcelery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'new_tango_project.settings')
app = Celery('new_tango_project')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
# CELERY_IMPORTS = ['tango.tasks']
# Optional configuration, see the application user guide.
app.conf.update(
CELERY_TASK_RESULT_EXPIRES=3600,
CELERY_RESULT_BACKEND='djcelery.backends.cache:CacheBackend',
)
if __name__ == '__main__':
app.start()
tasks.py (in my main project folder new_tango_project)
from __future__ import absolute_import
from celery import Celery
from celery.task import task
app = Celery('new_tango_project',
broker='amqp://sebrabbit:seb#localhost:5672/myvhost',
backend='amqp://',
include=['tasks'])
#task
def add(x, y):
return x + y
#task
def mul(x, y):
return x * y
#task
def xsum(numbers):
return sum(numbers)
#task
def parse(file_id, xlrd_file):
return "HAHAHAHHHAHHA"
tasks.py in my application folder
from __future__ import absolute_import
from celery import Celery
from celery.task import task
#
app = Celery('tango')
#task
def add(x, y):
return x + y
#task
def asdasdasd(x, y):
return x + y
celery console when starting
-------------- celery#debian v3.1.17 (Cipater)
---- **** -----
--- * *** * -- Linux-3.2.0-4-amd64-x86_64-with-debian-7.8
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: new_tango_project:0x1b746d0
- ** ---------- .> transport: amqp://sebrabbit:**#localhost:5672/myvhost
- ** ---------- .> results: amqp://
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
Finally my console log...
[2015-02-20 11:19:45,678: ERROR/MainProcess] Received unregistered task of type 'new_tango_project.tasks.add'.
The message has been ignored and discarded.
Did you remember to import the module containing this task?
Or maybe you are using relative imports?
Please see http://bit.ly/gLye1c for more information.
The full contents of the message body was:
{'utc': True, 'chord': None, 'args': (123123123, 123213213), 'retries': 0, 'expires': None, 'task': 'new_tango_project.tasks.add', 'callbacks': None, 'errbacks': None, 'timelimit': (None, None), 'taskset': None, 'kwargs': {}, 'eta': None, 'id': 'd9a8e560-1cd0-491d-a132-10345a04f391'} (233b)
Traceback (most recent call last):
File "/home/seb/PycharmProjects/tango/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 455, in on_task_received
strategies[name](message, body,
KeyError: 'new_tango_project.tasks.add'
This is the log from one of many tries importing the tasks.
Where I`m making mistake ?
Best wishes

Hint 1: In all your tasks.py you declare your Celery app as app = Celery(...) but you don't specify which app the task should be attached to in your task decorators.
Try to change your #task into #app.task and see if it works.
Hint 2: Why do you need to create a new Celery app in every tasks.py? Why don't you just import one main Celery app with from new_tango_project.celery import app and then declare your tasks with #app.task?
Hint 3: Once you have your tasks defined (possibly both in celery.py and tasks.py in the applications), just do
from new_tango_project.celery import add
from my_app.tasks import add_bis
def my_view(request):
...
add.delay(*your_params) # using the task from your celery.py
add_bis.delay(*your_params) # your task from the application

I wonder how you start up your celery worker. I encounter this once because I didn't start worker right: You should add -A option when execute "celery worker -l info" so that celery will connect to the broker you configured in your Celery Obj. Otherwise celery will try to connect the default broker.

Related

Celery - worker only sometimes picks up tasks

I am building a lead generation portal that can be accessed online. Please don't mind the verbosity of the code, I'm doing a lot of debugging right now.
My Celery worker inconsistently picks up tasks assigned to it, and I'm not sure why.
The weird thing about this, is that sometimes it works 100% perfect: there never are any explicit errors in the terminal.
I am currently in DEBUG = TRUE and REDIS as a broker!
celery start worker terminal command and response
celery -A mysite worker -l info --pool=solo
-------------- celery#DESKTOP-OG8ENRQ v5.0.2 (singularity)
--- ***** -----
-- ******* ---- Windows-10-10.0.19041-SP0 2020-11-09 00:36:13
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: mysite:0x41ba490
- ** ---------- .> transport: redis://localhost:6379//
- ** ---------- .> results: redis://localhost:6379/
- *** --- * --- .> concurrency: 12 (solo)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. mysite.celery.debug_task
. submit
[2020-11-09 00:36:13,899: INFO/MainProcess] Connected to redis://localhost:6379//
[2020-11-09 00:36:14,939: WARNING/MainProcess] c:\users\coole\pycharmprojects\lead_django_retry\venv\lib\site-packages\celery\app\control.py:48: DuplicateNodenameWarning: Received multiple replies from node name: celery#DESKTOP-OG8ENRQ.
Please make sure you give each node a unique nodename using
the celery worker `-n` option.
warnings.warn(DuplicateNodenameWarning(
[2020-11-09 00:36:14,939: INFO/MainProcess] mingle: all alone
[2020-11-09 00:36:14,947: INFO/MainProcess] celery#DESKTOP-OG8ENRQ ready.
views.py
class LeadInputView(FormView):
template_name = 'lead_main.html'
form_class = LeadInput
def form_valid(self, form):
print("I'm at views")
form.submit()
print(form.submit)
return HttpResponseRedirect('./success/')
tasks.py
#task(name="submit")
def start_task(city, category, email):
print("I'm at tasks!")
print(city, category, email)
"""sends an email when feedback form is filled successfully"""
logger.info("Submitted")
return start(city, category, email)
forms.py
class LeadInput(forms.Form):
city = forms.CharField(max_length=50)
category = forms.CharField(max_length=50)
email = forms.EmailField()
def submit(self):
print("I'm at forms!")
x = (start_task.delay(self.cleaned_data['city'], self.cleaned_data['category'], self.cleaned_data['email']))
return x
celery.py
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')
app = Celery('mysite')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
#app.task(bind=True)
def debug_task(self):
print('Request: {0!r}'.format(self.request))
settings.py
BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
The runserver terminal will look something like this:
I'm at views
I'm at forms!
<bound method LeadInput.submit of <LeadInput bound=True, valid=True, fields=(city;category;email)>>
But the worker doesn't say that it picked up anything, just that "celery#DESKTOP-OG8ENRQ ready." Except, when it does work... for some reason? I'm at a loss!
Hello to whoever sees this. It turns out, that this is a bug with celery (or maybe redis?)... apparently many windows users run into this. https://github.com/celery/celery/issues/3759
Turns out, the answer is to make -P solo when starting worker. I'm not sure why this is the case... but that solved it!
Thank you Naqib for your help! You put me down the right rabbit hole to a solution.
By default, celery will use the hostname as worker name if your willing to use multiple workers in the same host then specify -n option.
celery -A mysite worker -l info --pool=solo -n worker2#%h
Your code works fine but the task is passed to the first worker, see
DuplicateNodenameWarning with no obvious reason #2938

Celery and flask, same celery app instance

I'm trying to add and remove tasks dynamically from celery beat through a few flask endpoints.
I created a simple project named myApp and a package called flaskr (yes, like the tutorial) with three files in it
myApp
flaskr
__init__.py
routes.py
tasks.py
wsgi.py
This is the endpoint code
#route_blueprint.route('/myApp/add_task')
def add():
print(celery.conf.beat_schedule)
print(hex(id(celery)))
celery.add_periodic_task(10.0, tasks.add.s(55, 2), name='add every 10')
print(celery.conf.beat_schedule)
return ""
I go to the PyCharm console and from one of them I run gunicorn like this:
gunicorn wsgi:app -b localhost:8000
From another console tab I also run Celery like this
celery -A flaskr.celery worker --loglevel=info
And from another I run beat like this
celery -A flaskr.celery beat -l=debug
When I hit the endpoint, in the console I can see the task being added but beat never sends it.
I suspected that flask was setting the task is a differente celery_app instance so I put a print of the celery object that I was trying to modify and yes, it was a different one.
This is from celery start
flaskr:0x110048978
-------------- celery#MacBook-Pro.local v4.3.0 (rhubarb)
---- **** -----
--- * *** * -- Darwin-18.6.0-x86_64-i386-64bit 2019-08-26 17:19:47
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: flaskr:0x110048978
- ** ---------- .> transport: redis://localhost:6379/2
- ** ---------- .> results: redis://localhost:6379/2
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
And this is from the endpoint
0x101e31e80
Question
I'm quite new with python but I guess it makes sense, because i'm triggering the same code from two different processes, one from the celery worker and the other one from flask/gunicorn, so they'll never see each other.
Is there a way to give flask access to the instance initialized from the celery command line instance or should I start the workers from inside flask? (I didn't see that in any documentation from celery nor flask)
This is the full code
__init__.py
from flask import Flask
from celery import Celery
import config
celery = Celery(__name__,
backend=config.CELERY_BACKEND,
broker=config.CELERY_BROKER,
include=['flaskr.tasks'])
#celery.task
def asd(x, y):
print('ADD')
# raise exceptions.Retry(20)
return x + y
def create_app(test_config=None):
# create and configure the app
app = Flask(__name__)
from .routes import route_blueprint
app.register_blueprint(route_blueprint)
return app
tasks.py
from __future__ import absolute_import, unicode_literals
from . import celery
import logging.config
logging.config.fileConfig('logging.conf')
logger = logging.getLogger('myApp')
#celery.task
def add(x, y):
print('ADD')
# raise exceptions.Retry(20)
return x + y
#celery.task(bind=True)
def see_you(self, x, y):
logger.info('Log de see_you')
print(x)
# print("See you in ten seconds!")
print('Initializing from tasks')
print(hex(id(celery)))
print('beat schedule: ' + str(celery.conf.beat_schedule))
# celery.add_periodic_task(10.0, add.s(1, 2), name='add every 10')
# print(str(celery.conf.beat_schedule))
routes.py
from flask import Blueprint
import logging.config
from . import tasks
from . import celery
route_blueprint = Blueprint('route_blueprint', __name__,)
logging.config.fileConfig('logging.conf')
logger = logging.getLogger('myApp')
#route_blueprint.route('/myApp/health')
def health():
return "Health ok"
#route_blueprint.route('/myApp/add_task')
def add():
print(celery.conf.beat_schedule)
# tasks.add.delay(55, 2)
print(hex(id(celery)))
celery.add_periodic_task(10.0, tasks.add.s(55, 2), name='add every 10')
print(celery.conf.beat_schedule)
return "okkk"

Unsure how to run Celery under windows for periodic tasks

I am having difficulty understanding how to run Celery after setting up some scheduled tasks.
Firstly, my project directory is structured as follows:
blogpodapi\api\__init__.py contains
from tasks import app
import celeryconfig
blogpodapi\api\celeryconfig.py contains
from datetime import timedelta
# Celery settings
CELERY_BROKER_URL = 'redis://localhost:6379/0'
BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
CELERY_IMPORTS = ("api.tasks",)
CELERYBEAT_SCHEDULE = {
'write-test': {
'task': 'api.tasks.addrandom',
'schedule': timedelta(seconds=2),
'args': (16000, 42)
},
}
blogpodapi\api\tasks.py contains
from __future__ import absolute_import
import random
from celery import Celery
app = Celery('blogpodapi')
#app.task
def add(x, y):
r = x + y
print "task arguments: {x}, {y}".format(x=x, y=y)
print "task result: {r}".format(r=r)
return r
#app.task
def addrandom(x, *args): # *args are not used, just there to be interchangable with add(x, y)
y = random.randint(1,100)
print "passing to add(x, y)"
return add(x, y)
blogpodapi\blogpodapi\__init__.py contains
from __future__ import absolute_import
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app # noqa
blogpodapi\blogpodapi\settings.py contains
...
# Celery settings
CELERY_BROKER_URL = 'redis://localhost:6379/0'
BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
CELERY_IMPORTS = ("api.tasks",)
...
I run celery -A blogpodapi worker --loglevel=info in command prompt and get the following:
D:\blogpodapi>celery -A blogpodapi worker --loglevel=info
-------------- celery#JM v3.1.23 (Cipater)
---- **** -----
--- * *** * -- Windows-8-6.2.9200
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: blogpodapi:0x348a940
- ** ---------- .> transport: redis://localhost:6379/0
- ** ---------- .> results: redis://localhost:6379/1
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[tasks]
. api.tasks.add
. api.tasks.addrandom
. blogpodapi.celery.debug_task
[2016-08-13 13:01:51,108: INFO/MainProcess] Connected to redis://localhost:6379/0
[2016-08-13 13:01:52,122: INFO/MainProcess] mingle: searching for neighbors
[2016-08-13 13:01:55,138: INFO/MainProcess] mingle: all alone
c:\python27\lib\site-packages\celery\fixups\django.py:265: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2016-08-13 13:02:00,157: WARNING/MainProcess] c:\python27\lib\site-packages\celery\fixups\django.py:265: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2016-08-13 13:02:27,790: WARNING/MainProcess] celery#JM ready.
I then run celery -A blogpodapi beat in command prompt and get the following:
D:\blogpodapi>celery -A blogpodapi beat
celery beat v3.1.23 (Cipater) is starting.
__ - ... __ - _
Configuration ->
. broker -> redis://localhost:6379/0
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]#%INFO
. maxinterval -> now (0s)
[2016-08-13 13:02:51,937: INFO/MainProcess] beat: Starting...
For some reason, I can't seem to view my periodic tasks being logged. Is there anything I am doing wrong?
UPDATE: here is my celery.py...
from __future__ import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'blogpodapi.settings')
from django.conf import settings # noqa
app = Celery('blogpodapi')
# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
#app.task(bind=True)
def debug_task(self):
print('Request: {0!r}'.format(self.request))
You need to run celery beat with the celery settings file
celery -A blogpodapi.celery beat --loglevel=INFO

Celery task results not persisted with rpc

I have been trying to get Celery task results to be routed to another process by making results persisted to a queue and another process can pick results from queue. So, have configured Celery as CELERY_RESULT_BACKEND = 'rpc', but still Python function returned value is not persisted to queue.
Not sure if any other configuration or code change required. Please help.
Here is the code example:
celery.py
from __future__ import absolute_import
from celery import Celery
app = Celery('proj',
broker='amqp://',
backend='rpc://',
include=['proj.tasks'])
# Optional configuration, see the application user guide.
app.conf.update(
CELERY_RESULT_BACKEND = 'rpc',
CELERY_RESULT_PERSISTENT = True,
CELERY_TASK_SERIALIZER = 'json',
CELERY_RESULT_SERIALIZER = 'json'
)
if __name__ == '__main__':
app.start()
tasks.py
from proj.celery import app
#app.task
def add(x, y):
return x + y
Running Celery as
celery worker --app=proj -l info --pool=eventlet -c 4
Solved by using Pika (Python implementation of the AMQP 0-9-1 protocol - https://pika.readthedocs.org) to post results back to celeryresults channel

Celery tasks in Django are always blocking

I have the following setup in my django settings:
CELERY_TASK_RESULT_EXPIRES = timedelta(minutes=30)
CELERY_CHORD_PROPAGATES = True
CELERY_ACCEPT_CONTENT = ['json', 'msgpack', 'yaml']
CELERY_ALWAYS_EAGER = True
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True
BROKER_URL = 'django://'
CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend'
I've included this under my installed apps:
'djcelery',
'kombu.transport.django'
My project structure is (django 1.5)
proj
|_proj
__init__.py
celery.py
|_apps
|_myapp1
|_models.py
|_tasks.py
This is my celery.py file:
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings.dev')
app = Celery('proj')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS, related_name='tasks')
In the main __init__.pyI have:
from __future__ import absolute_import
from .celery import app as celery_app
And finally in myapp1/tasks.py I define my task:
#task()
def retrieve():
# Do my stuff
Now, if I launch a django interactive shell and I launch the retrieve task:
result = retrieve.delay()
it always seems to be a blocking call, meaning that the prompt is bloked until the function returns. The result status is SUCCESS, the function actually performs the operations BUT it seems not to be async. What am I missing?
it seems like CELERY_ALWAYS_EAGER causes this
if this is True, all tasks will be executed locally by blocking until
the task returns. apply_async() and Task.delay() will return an
EagerResult instance, which emulates the API and behavior of
AsyncResult, except the result is already evaluated.
That is, tasks will be executed locally instead of being sent to the
queue.

Categories

Resources