Django: calling celery task declared in same module

I have a Celery task declared in my Django project that I'm trying to call from the same module it's declared in. Right now, it looks like the following:
# myapp/admin.py
import time

from myproject.celery import app as celery_app

@celery_app.task(name='myapp.admin.add')
def add(x, y):
    time.sleep(10000)
    return x + y

def my_custom_admin_action(modeladmin, request, queryset):
    add.delay(2, 4)

# action later declared in the ModelAdmin
Knowing that Celery is sometimes finicky with relative imports, I've specified the task name explicitly. I even added the following to my settings.py:
CELERY_IMPORTS = ('myapp.admin', )
But when I try to use the admin action, I get the following message in my manage.py celeryd output:
[2014-09-18 14:58:25,413: ERROR/MainProcess] Received unregistered task of type 'myapp.admin.add'.
The message has been ignored and discarded.
Did you remember to import the module containing this task?
Or maybe you are using relative imports?
Please see http://bit.ly/gLye1c for more information.
Traceback (most recent call last):
File "/Users/JJ/.virtualenvs/TCJ/lib/python2.7/site-packages/celery/worker/consumer.py", line 455, in on_task_received
strategies[name](message, body,
KeyError: 'myapp.admin.add'
What am I doing wrong here? I even tried importing within the action as from . import add, but that didn't seem to help.

Celery is not picking up your add task. One way to solve this is to modify your Celery instance.
In myproject/celery.py, change the Celery instance from
app = Celery('name', backend='your_backend', broker='your_broker')
to
app = Celery('name', backend='your_backend', broker='your_broker',
             include=['myapp.admin'])
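For context, a minimal sketch of what the whole myproject/celery.py could look like with that change (the backend/broker URLs here are placeholders, not taken from the question):
# myproject/celery.py -- sketch; backend/broker URLs are placeholders
from celery import Celery

app = Celery('myproject',
             backend='redis://localhost:6379/0',   # placeholder backend
             broker='redis://localhost:6379/1',    # placeholder broker
             include=['myapp.admin'])              # modules the worker imports at startup
The include argument is the per-instance counterpart of the CELERY_IMPORTS setting, and it is applied even when the app is not configured to read the Django settings module, which may be why setting CELERY_IMPORTS alone did not help here.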


django DoesNotExist matching query does not exist with postgres only [duplicate]

This question already has answers here: Django related objects are missing from celery task (race condition?) (3 answers). Closed 5 years ago.
The assets Django app I'm working on runs well with SQLite, but I'm facing performance issues with deletes/updates of large sets of records, so I'm transitioning to a PostgreSQL database.
To do so, I'm starting fresh: updating theapp/settings.py to configure PostgreSQL, creating a fresh database, and deleting the assets/migrations/ directory. I then run:
./manage.py makemigrations assets
./manage.py migrate --run-syncdb
./manage.py createsuperuser
I have a function, post_create, registered as a signal handler; it runs a scan when a Scan object is created. Within the class assets.models.Scan:
@classmethod
def post_create(cls, sender, instance, created, *args, **kwargs):
    if not created:
        return
    from celery.result import AsyncResult
    # get the domains for the project, from scan
    print("debug: task = tasks.populate_endpoints.delay({})".format(instance.pk))
    task = tasks.populate_endpoints.delay(instance.pk)
The offending code:
from celery import shared_task
....
import datetime

@shared_task
def populate_endpoints(scan_pk):
    from .models import Scan, Project
    from anotherapp.plugins.sensual import subdomains

    scan = Scan.objects.get(pk=scan_pk)  # <<<<<<<< django no like
    new_entries_count = 0
    project = Project.objects.get(id=scan.project.id)
    ....
The resulting DoesNotExist exception:
debug: task = tasks.populate_endpoints.delay(2)
[2017-09-14 23:18:34,950: ERROR/ForkPoolWorker-8] Task assets.tasks.populate_endpoints[4555d329-2873-4184-be60-55e44c46a858] raised unexpected: DoesNotExist('Scan matching query does not exist.',)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 374, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 629, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/src/app/theapp/assets/tasks.py", line 12, in populate_endpoints
scan = Scan.objects.get(pk=scan_pk)
Interacting through ./manage.py shell, however, indicates that the Scan object with pk == 2 exists:
>>> from assets.models import Scan
>>> Scan.objects.all()
<QuerySet [<Scan: ACME Web Test Scan>]>
>>> s = Scan.objects.all().first()
>>> s.pk
2
My only guess is that at the time the post_create function is called, the Scan object still does not exist in the PostgreSQL database, despite save() having been called.
SQLite does not exhibit this problem.
Also, I haven't found a relevant, related problem on stackoverflow as the DoesNotExist exception looks to be fairly generic and caused by many things.
Any ideas on this would be much appreciated.
This is a well-known problem resulting from transactions and isolation level: sometimes the transaction has not yet been committed when the task executes, and if your isolation level is READ COMMITTED then you indeed can't read that record from another process. Django 1.9 introduced the transaction.on_commit hook as a solution.
NB: technically this question is a duplicate of Django related objects are missing from celery task (race condition?), but the accepted answer there uses django-transaction-hooks, which has since been merged into Django.
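For illustration, a minimal sketch of the on_commit approach applied to the handler from the question (the handler and task names come from the question; the exact wiring is an assumption):
# assets/models.py -- sketch: enqueue the task only once the row is committed
from django.db import transaction

@classmethod
def post_create(cls, sender, instance, created, *args, **kwargs):
    if not created:
        return
    # transaction.on_commit (Django >= 1.9) defers the callback until the
    # surrounding transaction commits, so the worker can actually see the row.
    transaction.on_commit(lambda: tasks.populate_endpoints.delay(instance.pk))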

need to restart python while applying Celery config

That's a small story...
I had this error:
AttributeError: 'DisabledBackend' object has no attribute '_get_task_meta_for'
When I changed tasks.py as Diederik suggested at Celery with RabbitMQ: AttributeError: 'DisabledBackend' object has no attribute '_get_task_meta_for':
app = Celery('tasks', backend='rpc://', broker='amqp://guest@localhost//')
ran it
>>> from tasks import add
>>> result = add.delay(4,50)
>>> result.ready()
I got DisabledBackend again... hmm, what was that?
Then I put the code into a file, run.py, and it returned True:
from tasks import add

try:
    result = add.delay(1, 4)
    print(result.ready())
except:
    print("except")
I see that if I call >>> from tasks import add after tasks.py has changed, it doesn't pick up the updates... The behaviour is the same for IPython, so since I can't figure out the reason, I advise people to debug from scripts like ~runthis.py.
I'll be glad for an answer that smashes my idea...
If you're using the interpreter, you need to
reload(tasks)
This forces the tasks module to be re-imported.
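For example, a rough sketch of that dance in an interactive session (tasks is the module from the question; on Python 3 reload lives in importlib, on Python 2 it is a builtin):
>>> import tasks
>>> from tasks import add          # first import; later edits to tasks.py are not seen
>>> # ... edit tasks.py in your editor ...
>>> from importlib import reload   # Python 3 only; skip this line on Python 2
>>> reload(tasks)                  # re-executes tasks.py with the new code
>>> from tasks import add          # rebind the name to the reloaded function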

How to integrate APScheduler and Imp?

I have built a plugin-based application where "plugins" (Python modules) can be loaded by imp and then scheduled for later execution by APScheduler. I was able to integrate the two successfully, but I want to implement persistence in case of crashes or application restarts, so I changed the default in-memory job store to SQLAlchemyJobStore. It works quite well the first time you run the program: tasks are loaded, scheduled, saved to the database and executed at the right time.
Problem is when I try to load the application again I get this traceback:
ERROR:apscheduler.jobstores.default:Unable to restore job "d3e0f0068df54d15986e9b7b6757f665" -- removing it
Traceback (most recent call last):
File "/home/jesus/.local/lib/python2.7/site-packages/apscheduler/jobstores/sqlalchemy.py", line 126, in _get_jobs
jobs.append(self._reconstitute_job(row.job_state))
File "/home/jesus/.local/lib/python2.7/site-packages/apscheduler/jobstores/sqlalchemy.py", line 114, in _reconstitute_job
job.__setstate__(job_state)
File "/home/jesus/.local/lib/python2.7/site-packages/apscheduler/job.py", line 228, in __setstate__
self.func = ref_to_obj(self.func_ref)
File "/home/jesus/.local/lib/python2.7/site-packages/apscheduler/util.py", line 257, in ref_to_obj
raise LookupError('Error resolving reference %s: could not import module' % ref)
LookupError: Error resolving reference __init__:run: could not import module
So there is clearly a problem when APScheduler attempts to import the function again.
Here is my scheduler initialization:
executors = {'default': ThreadPoolExecutor(5)}
jobstores = {'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')}
self.scheduler = BackgroundScheduler(executors=executors, jobstores=jobstores)
I have a "tests" dictionary containing the "plugins" that should be loaded and some parameters, "load_plugin" uses imp to load a plugin by it's name.
for test, parameters in tests.items():
    if test in pluggins:
        module = load_plugin(pluggins[test])
        self.jobs[test] = self.scheduler.add_job(module.run, "interval",
                                                 seconds=parameters["interval"],
                                                 name=test)
Any idea about how can I handle reconstituting jobs?
Something in the automatic detection of the module name is going wrong. It's hard to say what exactly, but the alternative is to manually give add_job the proper lookup path as a textual reference (e.g. "package.module:function"). If you can do that, you avoid this problem.
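As a rough sketch of that alternative (assuming the plugin is importable as a normal module; myplugins.nightly_check is a hypothetical path standing in for whatever imp currently loads):
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.executors.pool import ThreadPoolExecutor
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

scheduler = BackgroundScheduler(
    executors={'default': ThreadPoolExecutor(5)},
    jobstores={'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')})

# A textual reference ('module.path:callable') is stored as a string, so the
# SQLAlchemy job store can re-import the callable after a restart instead of
# failing to resolve a function from a module that imp loaded under a fake name.
scheduler.add_job('myplugins.nightly_check:run', 'interval', seconds=60,
                  name='nightly_check')
scheduler.start()
The catch is that the plugin must then be importable by that dotted path in a fresh process, which usually means packaging the plugins as real modules rather than loading them through imp under arbitrary names.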

Use .replace method with Celery sub-tasks

I'm trying to solve a problem in celery:
I have one task that queries an API for ids, and then starts a sub-task for each of these.
I do not know, ahead of time, what the ids are, or how many there are.
For each id, I go through a big calculation that then dumps some data into a database.
After all the sub-tasks are complete, I want to run a summary function (export DB results to an Excel format).
Ideally, I do not want to block my main worker querying the status of the sub-tasks (Celery gets angry if you try this.)
This question looks very similar (if not identical?): Celery: Callback after task hierarchy
So using the "solution" (which is a link to this discussion, I tried the following test script:
# test.py
from celery import Celery, chord
from celery.utils.log import get_task_logger

app = Celery('test', backend='redis://localhost:45000/10?new_join=1',
             broker='redis://localhost:45000/11')
app.conf.CELERY_ALWAYS_EAGER = False
logger = get_task_logger(__name__)

@app.task(bind=True)
def get_one(self):
    print('hello world')
    self.replace(get_two.s())
    return 1

@app.task
def get_two():
    print('Returning two')
    return 2

@app.task
def sum_all(data):
    print('Logging data')
    logger.error(data)
    return sum(data)

if __name__ == '__main__':
    print('Running test')
    x = chord(get_one.s() for i in range(3))
    body = sum_all.s()
    result = x(body)
    print(result.get())
    print('Finished w/ test')
It doesn't work for me. I get an error:
AttributeError: 'get_one' object has no attribute 'replace'
Note that I do have new_join=1 in my backend URL, though not the broker. If I put it there, I get an error:
TypeError: _init_params() got an unexpected keyword argument 'new_join'
What am I doing wrong? I'm using Python 3.4.3 and the following packages:
amqp==1.4.6
anyjson==0.3.3
billiard==3.3.0.20
celery==3.1.18
kombu==3.0.26
pytz==2015.4
redis==2.10.3
The Task.replace method will be added in Celery 3.2: http://celery.readthedocs.org/en/master/whatsnew-3.2.html#task-replace (the changelog entry is misleading, because it suggests that Task.replace existed before and was merely changed).
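For reference, a sketch of what get_one could look like on a Celery release that actually ships Task.replace (behavior as described in the later 4.x docs, where replace ends the host task and hands its place in the chord to the new signature; this is an illustration, not something that runs on 3.1.18):
@app.task(bind=True)
def get_one(self):
    print('hello world')
    # replace() substitutes get_two for this task, so the chord callback
    # receives get_two's return value; execution of get_one ends inside
    # replace(), and the raise just makes that control flow explicit.
    raise self.replace(get_two.s())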

PermanentTaskFailure: 'module' object has no attribute 'Migrate'

I'm using Nick Johnson's Bulk Update library on Google App Engine (http://blog.notdot.net/2010/03/Announcing-a-robust-datastore-bulk-update-utility-for-App-Engine). It works wonderfully for other tasks, but for some reason with the following code:
from google.appengine.ext import db
from myapp.main.models import Story, Comment
import bulkupdate

class Migrate(bulkupdate.BulkUpdater):
    DELETE_COMPLETED_JOBS_DELAY = 0
    DELETE_FAILED_JOBS = False
    PUT_BATCH_SIZE = 1
    DELETE_BATCH_SIZE = 1
    MAX_EXECUTION_TIME = 10

    def get_query(self):
        return Story.all().filter("hidden", False).filter("visible", True)

    def handle_entity(self, entity):
        comments = entity.comment_set
        for comment in comments:
            s = Story()
            s.parent_story = comment.story
            s.user = comment.user
            s.text = comment.text
            s.submitted = comment.submitted
            self.put(s)

job = Migrate()
job.start()
I get the following error in my logs:
Permanent failure attempting to execute task
Traceback (most recent call last):
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/deferred/deferred.py", line 258, in post
run(self.request.body)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/deferred/deferred.py", line 122, in run
raise PermanentTaskFailure(e)
PermanentTaskFailure: 'module' object has no attribute 'Migrate'
It seems quite bizarre to me. Clearly that class is right above the job; they're in the same file, and job.start() is clearly being called. Why can't it see my Migrate class?
EDIT: I added this update job in a newer version of the code, which isn't the default. I invoke the job with the correct URL (http://version.myapp.appspot.com/migrate). Is it possible this is related to the fact that it isn't the 'default' version served by App Engine?
It seems likely that your declaration of the 'Migrate' class is in the handler script (e.g. the one directly invoked by app.yaml). A limitation of deferred is that you can't use it to call functions defined in the handler module.
Incidentally, my bulk update library is deprecated in favor of App Engine's mapreduce support; you should probably use that instead.
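A minimal sketch of that workaround, with the class moved into a hypothetical migrate_job.py that is not the script app.yaml points at:
# migrate_job.py -- hypothetical module, importable by the deferred worker
import bulkupdate
from myapp.main.models import Story

class Migrate(bulkupdate.BulkUpdater):
    PUT_BATCH_SIZE = 1

    def get_query(self):
        return Story.all().filter("hidden", False).filter("visible", True)

    def handle_entity(self, entity):
        for comment in entity.comment_set:
            s = Story()
            s.parent_story = comment.story
            s.user = comment.user
            s.text = comment.text
            s.submitted = comment.submitted
            self.put(s)

# In the request handler (the script app.yaml maps), only import and start it:
#     from migrate_job import Migrate
#     Migrate().start()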
