I thought that app.control.broadcast would take an @task, but when running the following:
app.send_task("workerTasks_supervisor.task_supervisor_test", args=[], queue='supervisor')
app.control.broadcast("workerTasks_supervisor.task_supervisor_test", args=[], queue="supervisor")
The first succeeds and the second fails with:
[2019-08-01 12:10:52,260: ERROR/MainProcess] pidbox command error: KeyError('task_supervisor_test',)
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/kombu/pidbox.py", line 104, in dispatch
reply = handle(method, arguments)
File "/usr/local/lib/python3.5/dist-packages/kombu/pidbox.py", line 126, in handle_cast
return self.handle(method, arguments)
File "/usr/local/lib/python3.5/dist-packages/kombu/pidbox.py", line 120, in handle
return self.handlers[method](self.state, **arguments)
KeyError: 'task_supervisor_test'
The worker is started with
celery worker -A workerTasks_supervisor -n Supervisor --concurrency=1 --loglevel=info -Q supervisor -f /logs/celery_supervisor.log --pidfile=/logs/supervisor_pid.pid
And the task itself is simple:
@app.task()
def task_supervisor_test():
    print("working")
What am I doing wrong?
Thanks.
Your assumption is wrong.
Your second line is trying to broadcast a command that you did not implement, so it naturally throws an exception.
The beauty of Celery (among many other things) is that it allows you to implement your own commands. You may execute them programmatically, as you tried above, or through the command line via something like celery -A my.project.app <command> [params...]. It is an extremely powerful concept that I suggest every Celery power user learn about.
OK, it's a different mechanism: control commands need to be registered like this:
from celery.worker.control import Panel

@Panel.register
def command_supervisor_test(state, **kwargs):
    # Control command handlers receive the worker state as their first argument.
    print("working")
In the control module, you can find some examples.
from celery.app import control
As an example, the revoke function uses broadcast internally:
def revoke(self, task_id, destination=None, terminate=False,
           signal=TERM_SIGNAME, **kwargs):
    return self.broadcast('revoke', destination=destination, arguments={
        'task_id': task_id,
        'terminate': terminate,
        'signal': signal,
    }, **kwargs)
I have implemented a Python command which starts celery:
#click.command("tasks", help="run this command to start celery task queue")
def tasks():
"""
Runs the celery task queue
"""
from celery.bin import worker
try:
worker = worker.worker(app=app.config.get("task_app"))
worker.run(app="my_app.task_queue", loglevel="info", uid=os.environ["uid"])
except Exception as e:
raise e
I need to create a similar command which starts celery beat, and I tried the following approach:
#click.command("beat", help="run this command to start beat")
def beat():
try:
p = app.config.get("task_app") # gets the current
command = CeleryCommand()
lst = ["celery", "beat"]
command.execute_from_commandline(lst)
except Exception as e:
raise e
With this approach the --app parameter will not work. Is there a way to make the equivalent of celery -A proj beat work programmatically, without passing anything on the command line?
To start celery beat programmatically:
app.Beat(loglevel='debug').run()
Alternatively:
from celery.apps.beat import Beat
b = Beat(app=your_celery_app, loglevel='debug')
b.run()
For the supported keyword arguments, see the celery.beat documentation.
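For reference, a minimal sketch of how the beat click command above might look with this approach (assuming, as in the asker's tasks command, that app.config.get("task_app") returns the Celery application):

import click

@click.command("beat", help="run this command to start beat")
def beat():
    # Hypothetical: reuse the Celery app stored in the web app's config,
    # mirroring the asker's existing "tasks" command.
    celery_app = app.config.get("task_app")
    celery_app.Beat(loglevel="info").run()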
I have written a module that dynamically adds periodic celery tasks based on a list of dictionaries in the project's settings (imported via django.conf.settings).
I do that using a function add_tasks that schedules a function to be called with a specific uuid which is given in the settings:
def add_tasks(celery):
    for new_task in settings.NEW_TASKS:
        celery.add_periodic_task(
            new_task['interval'],
            my_task.s(new_task['uuid']),
            name='My Task %s' % new_task['uuid'],
        )
As suggested here, I use the on_after_configure.connect signal to call the function in my celery.py:
app = Celery('my_app')

@app.on_after_configure.connect
def setup_periodic_tasks(celery, **kwargs):
    from add_tasks_module import add_tasks
    add_tasks(celery)
This setup works fine for both celery beat and celery worker, but it breaks my setup where I use uWSGI to serve my Django application. uWSGI runs smoothly until the first time the view code sends a task using celery's .delay() method. At that point celery is initialized inside uWSGI but blocks forever in the above code. If I run this manually from the command line and then interrupt it when it blocks, I get the following (shortened) stack trace:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'tasks'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'data'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'tasks'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
(SHORTENED HERE. Just contained the trace from the console through my call to this function)
File "/opt/my_app/add_tasks_module/__init__.py", line 42, in add_tasks
my_task.s(new_task['uuid']),
File "/usr/local/lib/python3.6/site-packages/celery/local.py", line 146, in __getattr__
return getattr(self._get_current_object(), name)
File "/usr/local/lib/python3.6/site-packages/celery/local.py", line 109, in _get_current_object
return loc(*self.__args, **self.__kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/app/__init__.py", line 72, in task_by_cons
return app.tasks[
File "/usr/local/lib/python3.6/site-packages/kombu/utils/objects.py", line 44, in __get__
value = obj.__dict__[self.__name__] = self.__get(obj)
File "/usr/local/lib/python3.6/site-packages/celery/app/base.py", line 1228, in tasks
self.finalize(auto=True)
File "/usr/local/lib/python3.6/site-packages/celery/app/base.py", line 507, in finalize
with self._finalize_mutex:
It seems like there is a problem with acquiring a mutex.
Currently I am using a workaround: I detect whether sys.argv[0] contains uwsgi and, if so, skip adding the periodic tasks (only beat needs them). But I would like to understand what is going wrong here to solve the problem more permanently.
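For completeness, that workaround looks roughly like this (a sketch based on the description above; only the sys.argv check is new):

import sys

@app.on_after_configure.connect
def setup_periodic_tasks(celery, **kwargs):
    # Workaround sketch: skip registering the periodic tasks when running
    # under uWSGI, since only beat actually needs them.
    if 'uwsgi' in sys.argv[0]:
        return
    from add_tasks_module import add_tasks
    add_tasks(celery)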
Could this problem have something to do with uWSGI running multi-threaded or multi-process, where one thread/process holds the mutex that another needs?
I'd appreciate any hints that can help me solve the problem. Thank you.
I am using: Django 1.11.7 and Celery 4.1.0
Edit 1
I have created a minimal setup for this problem:
celery.py:
import os
from celery import Celery
from django.conf import settings
from myapp.tasks import my_task

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_app.settings')

app = Celery('my_app')

@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    sender.add_periodic_task(
        60,
        my_task.s(),
        name='Testtask'
    )

app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
tasks.py:
from celery import shared_task

@shared_task()
def my_task():
    print('ran')
Make sure that CELERY_TASK_ALWAYS_EAGER=False and that you have a working message queue.
Run:
./manage.py shell -c 'from myapp.tasks import my_task; my_task.delay()'
Wait about 10 seconds before interrupting to see the above error.
So, I have found out that the @shared_task decorator creates the problem. I can circumvent the problem when I declare the task right in the function called by the signal, like so:
def add_tasks(celery):
    @celery.task
    def my_task(uuid):
        print(uuid)

    for new_task in settings.NEW_TASKS:
        celery.add_periodic_task(
            new_task['interval'],
            my_task.s(new_task['uuid']),
            name='My Task %s' % new_task['uuid'],
        )
This solution is actually working for me, but I have one more problem with it: I use this code in a pluggable app, so I can't directly access the Celery app outside of the signal handler, yet I would also like to be able to call the my_task function from other code. Since it is defined inside the add_tasks function, it is not available outside of it, so I cannot import it anywhere else.
I can probably work around this by defining the task function outside of the signal function and using it with different decorators here and in tasks.py. I am wondering, though, if there is a decorator apart from the @shared_task decorator that I can use in tasks.py that does not create the problem.
The current best solution could be:
task_app.__init__.py:
def my_task(uuid):
    # do stuff
    print(uuid)

def add_tasks(celery):
    celery_my_task = celery.task(my_task)
    for new_task in settings.NEW_TASKS:
        celery.add_periodic_task(
            new_task['interval'],
            celery_my_task.s(new_task['uuid']),  # pass a signature rather than calling the task
            name='My Task %s' % new_task['uuid'],
        )
task_app.tasks.py:
from celery import shared_task
from task_app import my_task

shared_my_task = shared_task(my_task)
myapp.celery.py:
import os
from celery import Celery
from django.conf import settings

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_app.settings')

app = Celery('my_app')

@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    from task_app import add_tasks
    add_tasks(sender)

app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
Could you give the @app.on_after_finalize.connect signal a try?
Here is a quick snippet from a working project with celery==4.1.0, Django==2.0, django-celery-beat==1.1.0 and django-celery-results==1.0.1:
@app.on_after_finalize.connect
def setup_periodic_tasks(sender, **kwargs):
    """ setup of periodic task :py:func:shopify_data_fetcher.celery.fetch_shopify
    based on the schedule defined in: settings.CELERY_BEAT_SCHEDULE
    """
    for task_name, task_config in settings.CELERY_BEAT_SCHEDULE.items():
        sender.add_periodic_task(
            task_config['schedule'],
            fetch_shopify.s(**task_config['kwargs']['resource_name']),
            name=task_name
        )
A piece of the CELERY_BEAT_SCHEDULE:
CELERY_BEAT_SCHEDULE = {
    'fetch_shopify_orders': {
        'task': 'shopify.tasks.fetch_shopify',
        'schedule': crontab(hour="*/3", minute=0),
        'kwargs': {
            'resource_name': shopify_constants.SHOPIFY_API_RESOURCES_ORDERS
        }
    }
}
Given a file myapp.py
from celery import Celery
celery = Celery("myapp")
celery.config_from_object("celeryconfig")
@celery.task(default_retry_delay=5 * 60, max_retries=12)
def add(a, b):
    with open("try.txt", "a") as f:
        f.write("A trial = {}!\n".format(a + b))
    raise add.retry([a, b])
Configured with a celeryconfig.py
CELERY_IMPORTS = ["myapp"]
BROKER_URL = "amqp://"
CELERY_RESULT_BACKEND = "amqp"
I run the following in the directory that contains both files:
$ celeryd -E
And then
$ python -c "import myapp; myapp.add.delay(2, 5)"
or
$ celery call myapp.add --args="[2, 5]"
So the try.txt is created with
A trial = 7!
only once.
That means the retry was ignored.
I tried many other things:
Using MongoDB as broker and backend and inspecting the database (strangely enough, I can't see anything in my broker "messages" collection even in a "countdown"-scheduled job)
The PING example in here, both with RabbitMQ and MongoDB
Printing on the screen with both print (like the PING example) and logging
Making the retry call in an except block after a deliberately raised exception, raising or returning the retry(), and changing the "throw" parameter to True/False/not specified (see the sketch after this list).
Seeing what's happening with celery flower (in which the "broker" link shows nothing)
But none of them worked =/
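For reference, the except-block pattern mentioned in the fourth attempt above usually looks like this (a sketch only, not a confirmed fix for this setup; retry() raises an exception that must propagate out of the task so the worker can reschedule it):

@celery.task(default_retry_delay=5 * 60, max_retries=12)
def add(a, b):
    try:
        with open("try.txt", "a") as f:
            f.write("A trial = {}!\n".format(a + b))
        raise ValueError("force a retry")  # hypothetical failure condition
    except ValueError as exc:
        # Re-raise the Retry exception so it is not swallowed.
        raise add.retry(exc=exc)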
My celery report output:
software -> celery:3.0.19 (Chiastic Slide) kombu:2.5.10 py:2.7.3
billiard:2.7.3.28 py-amqp:N/A
platform -> system:Linux arch:64bit, ELF imp:CPython
loader -> celery.loaders.default.Loader
settings -> transport:amqp results:amqp
Is there anything wrong above? What do I need to do to make the retry() method work?
I'm using Celery to manage asynchronous tasks. Occasionally, however, the celery process goes down which causes none of the tasks to get executed. I would like to be able to check the status of celery and make sure everything is working fine, and if I detect any problems display an error message to the user. From the Celery Worker documentation it looks like I might be able to use ping or inspect for this, but ping feels hacky and it's not clear exactly how inspect is meant to be used (if inspect().registered() is empty?).
Any guidance on this would be appreciated. Basically what I'm looking for is a method like so:
def celery_is_alive():
    from celery.task.control import inspect
    return bool(inspect().registered())  # is this right??
EDIT: It doesn't even look like registered() is available on celery 2.3.3 (even though the 2.1 docs list it). Maybe ping is the right answer.
EDIT: Ping also doesn't appear to do what I thought it would do, so still not sure the answer here.
Here's the code I've been using. celery.task.control.Inspect.stats() returns a dict containing lots of details about the currently available workers, None if there are no workers running, or raises an IOError if it can't connect to the message broker. I'm using RabbitMQ - it's possible that other messaging systems might behave slightly differently. This worked in Celery 2.3.x and 2.4.x; I'm not sure how far back it goes.
def get_celery_worker_status():
    ERROR_KEY = "ERROR"
    try:
        from celery.task.control import inspect
        insp = inspect()
        d = insp.stats()
        if not d:
            d = {ERROR_KEY: 'No running Celery workers were found.'}
    except IOError as e:
        from errno import errorcode
        msg = "Error connecting to the backend: " + str(e)
        if len(e.args) > 0 and errorcode.get(e.args[0]) == 'ECONNREFUSED':
            msg += ' Check that the RabbitMQ server is running.'
        d = {ERROR_KEY: msg}
    except ImportError as e:
        d = {ERROR_KEY: str(e)}
    return d
From the documentation of celery 4.2:
from your_celery_app import app

def get_celery_worker_status():
    i = app.control.inspect()
    availability = i.ping()
    stats = i.stats()
    registered_tasks = i.registered()
    active_tasks = i.active()
    scheduled_tasks = i.scheduled()
    result = {
        'availability': availability,
        'stats': stats,
        'registered_tasks': registered_tasks,
        'active_tasks': active_tasks,
        'scheduled_tasks': scheduled_tasks
    }
    return result
Of course, you could/should improve the code with error handling...
To check the same from the command line when celery is running as a daemon:
Activate the virtualenv and go to the directory where the 'app' is.
Now run: celery -A [app_name] status
It will show whether celery is up or not, plus the number of nodes online.
Source:
http://michal.karzynski.pl/blog/2014/05/18/setting-up-an-asynchronous-task-queue-for-django-using-celery-redis/
The following worked for me:
import socket
from kombu import Connection

celery_broker_url = "amqp://localhost"

try:
    conn = Connection(celery_broker_url)
    conn.ensure_connection(max_retries=3)
except socket.error:
    raise RuntimeError("Failed to connect to RabbitMQ instance at {}".format(celery_broker_url))
One method to test if any worker is responding is to send out a 'ping' broadcast and return with a successful result on the first response.
from .celery import app  # the celery 'app' created in your project

def is_celery_working():
    result = app.control.broadcast('ping', reply=True, limit=1)
    return bool(result)  # True if at least one result
This broadcasts a 'ping' and will wait up to one second for responses. As soon as the first response comes in, it will return a result. If you want a False result faster, you can add a timeout argument to reduce how long it waits before giving up.
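For example, with a shorter timeout (both reply and timeout are parameters of broadcast()):

# Same check, but give up after half a second instead of the default one second.
result = app.control.broadcast('ping', reply=True, limit=1, timeout=0.5)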
I found an elegant solution:
from .celery import app

try:
    app.broker_connection().ensure_connection(max_retries=3)
except Exception as ex:
    raise RuntimeError("Failed to connect to celery broker, {}".format(str(ex)))
You can use the ping method to check whether any worker (or a specific worker) is alive: https://docs.celeryproject.org/en/latest/_modules/celery/app/control.html#Control.ping
celery_app.control.ping()
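A small sketch of how the reply can be used (celery_app here stands for your application object; ping() returns one reply dict per responding worker, so an empty list means nothing answered):

def any_worker_alive(celery_app):
    # Each reply looks like {'worker-name@host': {'ok': 'pong'}}.
    return bool(celery_app.control.ping(timeout=1.0))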
You can test it in your terminal by running the following command:
celery -A proj_name worker -l INFO
This lets you review the output every time your celery worker runs.
The script below worked for me.
import logging

# Import the celery app from the project
from application_package import app as celery_app

logger = logging.getLogger(__name__)

def get_celery_worker_status():
    insp = celery_app.control.inspect()
    nodes = insp.stats()
    if not nodes:
        raise Exception("celery is not running.")
    logger.error("celery workers are: {}".format(nodes))
    return nodes
Run celery status to get the status.
When celery is running,
(venv) ubuntu@server1:~/project-dir$ celery status
-> celery@server1: OK

1 node online.
When no celery worker is running, you get the following displayed in the terminal.
(venv) ubuntu@server1:~/project-dir$ celery status
Error: No nodes replied within time constraint
UPDATE 3: found the issue. See the answer below.
UPDATE 2: It seems I might have been dealing with an automatic naming and relative imports problem by running the djcelery tutorial through the manage.py shell; see below. It is still not working for me, but now I get new log error messages. See below.
UPDATE: I added the log at the bottom of the post. It seems the example task is not registered?
Original Post:
I am trying to get django-celery up and running. I was not able to get through the example.
I installed rabbitmq successfully and went through the tutorials without trouble: http://www.rabbitmq.com/getstarted.html
I then tried to go through the djcelery tutorial.
When I run python manage.py celeryd -l info I get the message:
[Tasks]
  - app.module.add

[2011-07-27 21:17:19,990: WARNING/MainProcess] celery@sequoia has started.
So that looks good. I put this at the top of my settings file:
import djcelery
djcelery.setup_loader()
BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
BROKER_VHOST = "/"
added these to my installed apps:
'djcelery',
here is my tasks.py file in the tasks folder of my app:
from celery.task import task

@task()
def add(x, y):
    return x + y
I added this to my django.wsgi file:
os.environ["CELERY_LOADER"] = "django"
Then I entered this at the command line:
>>> from app.module.tasks import add
>>> result = add.delay(4,4)
>>> result
(AsyncResult: 7auathu945gry48- a bunch of stuff)
>>> result.ready()
False
So it looks like it worked, but here is the problem:
>>> result.result
>>> (nothing is returned)
>>> result.get()
When I put in result.get() it just hangs. What am I doing wrong?
UPDATE: This is what running the logger in the foreground says when I start up the worker server:
No handlers could be found for logger "multiprocessing"

[Configuration]
  - broker: amqplib://guest@localhost:5672/
  - loader: djcelery.loaders.DjangoLoader
  - logfile: [stderr]@INFO
  - concurrency: 4
  - events: OFF
  - beat: OFF

[Queues]
  - celery: exchange: celery (direct) binding: celery

[Tasks]
  - app.module.add

[2011-07-27 21:17:19,990: WARNING/MainProcess] celery@sequoia has started.

C:\Python27\lib\site-packages\django-celery-2.2.4-py2.7.egg\djcelery\loaders.py:80: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
  warnings.warn("Using settings.DEBUG leads to a memory leak, never")
then when I put in the command:
>>> result = add(4,4)
This appears in the error log:
[2011-07-28 11:00:39,352: ERROR/MainProcess] Unknown task ignored: Task of kind 'tasks.add' is not registered, please make sure it's imported. Body->"{'retries': 0, 'task': 'tasks.add', 'args': (4,4), 'expires': None, 'eta': None,
'kwargs': {}, 'id': '225ec0ad-195e-438b-8905-ce28e7b6ad9'}"
Traceback (most recent call last):
  File "C:\Python27\..\celery\worker\consumer.py", line 368, in receive_message
    eventer=self.event_dispatcher)
  File "C:\Python27\..\celery\worker\job.py", line 306, in from_message
    **kw)
  File "C:\Python27\..\celery\worker\job.py", line 275, in __init__
    self.task = tasks[self.task_name]
  File "C:\Python27\...\celery\registry.py", line 59, in __getitem__
    raise self.NotRegistered(key)
NotRegistered: 'tasks.add'
How do I get this task to be registered and handled properly? Thanks.
UPDATE 2:
This link suggested that the not registered error can be due to task name mismatches between client and worker - http://celeryproject.org/docs/userguide/tasks.html#automatic-naming-and-relative-imports
I exited the manage.py shell, entered a plain Python shell, and ran the following:
>>> from app.module.tasks import add
>>> result = add.delay(4,4)
>>> result.ready()
False
>>> result.result
>>> (nothing returned)
>>> result.get()
(it just hangs there)
So I am getting the same behavior, but a new log message. From the log, it appears the server is working but it won't feed the result back out:
[2011-07-28 11:39:21,706: INFO/MainProcess] Got task from broker: app.module.tasks.add[7e794740-63c4-42fb-acd5-b9c6fcd545c3]
[2011-07-28 11:39:21,706: INFO/MainProcess] Task app.module.tasks.add[7e794740-63c4-42fb-acd5-b9c6fcd545c3] succeeded in 0.04600000038147s: 8
So the server got the task and it computed the correct answer, but it won't send it back? why not?
I found the solution to my problem from another stackoverflow post: Why does Celery work in Python shell, but not in my Django views? (import problem)
I had to add these lines to my settings file:
CELERY_RESULT_BACKEND = "amqp"
CELERY_IMPORTS = ("app.module.tasks", )
Then, in the tasks.py file, I named the task as such:
@task(name="module.tasks.add")
The server and the client had to be informed of the task names. The celery and django-celery tutorials omit these lines in their tutorials.
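Put together, the tasks module from the question then looks roughly like this (a sketch; the module path follows the asker's app/module layout):

# app/module/tasks.py
from celery.task import task

# Explicit name so the client and the worker agree on the task's identity.
@task(name="module.tasks.add")
def add(x, y):
    return x + y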
If you run celery in debug mode it is easier to understand the problem:
python manage.py celeryd
What do the celery logs say? Is celery receiving the task?
If not, there is probably a problem with the broker (wrong queue?).
Give us more detail so that we can help you.