Celery, RabbitMQ messages keep getting sent - python

Here is the setup - django project with celery and a CloudAMQP rabbitMQ worker doing the message brokering.
My Celery/RabbitMQ settings:
# RabbitMQ & Celery settings
BROKER_URL = 'ampq://guest:guest#localhost:5672/' # Understandably fake
BROKER_POOL_LIMIT = 1
BROKER_CONNECTION_TIMEOUT = 30
BROKER_HEARTBEAT = 30
CELERY_SEND_EVENTS = False
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
A docker container running celery with the following command:
bash -c 'cd django && celery -A pkm_main worker -E -l info --concurrency=3'
The shared_task definition:
from __future__ import absolute_import
from celery import shared_task
#shared_task
def push_notification(user_id, message):
logging.critical('Push notifications sent')
return {'status': 'success'}
And me actually calling it when something happens (I have omitted some of the code because it does not seem to be relevant):
from notificatons.tasks import push_notification
def like_this(self, **args):
# Do like stuff and then do .delay()
push_notification.delay(media.user.id, request.user.username + ' has liked your item')
So when this is ran - everything seems fine and dandy - the output looks like so:
worker_1 | [2016-03-25 09:03:34,888: INFO/MainProcess] Received task: notifications.tasks.push_notification[8443bd88-fa02-4ea4-9bff-8fbec8c91516]
worker_1 | [2016-03-25 09:03:35,333: CRITICAL/Worker-1] Push notifications sent
worker_1 | [2016-03-25 09:03:35,336: INFO/MainProcess] Task notifications.tasks.push_notification[8443bd88-fa02-4ea4-9bff-8fbec8c91516] succeeded in 0.444933412999s: {'status': 'success'}
So from what I gather the task has been ran and executed properly, the messages should be stopped and RabbitMQ should stop.
But in my RabbitMQ Management I see messages getting published and delivered non-stop:
So what I'm gathering from this is that RabbitMQ is trying to send some sort of confirmation and failing and retrying? Is there a way to actually turn this behavior off?
All help and advice is warmly welcomed.
EDIT: Forgot to mentions something important - until I call on push_notification.delay() the message tab is empty save for the heartbeat that comes and goes every 30 seconds. Only after I have called .delay() does this happen.
EDIT 2: CELERYBEAT_SCHEDULE settings (I've tried running with and without them - there was no difference but adding them just in case)
CELERYBEAT_SCHEDULE = {
"minutely_process_all_notifications": {
'task': 'transmissions.tasks.process_all_notifications',
'schedule': crontab(minute='*')
}
}
EDIT 3: Added View code. Also I'm not using the CELERYBEAT_SCHEDULE. I'm just keeping the config in the code for future scheduled tasks
from notifications.tasks import push_notification
class MediaLikesView(BaseView):
def post(self, request, media_id):
media = self.get_object(media_id)
data = {}
data['media'] = media.id
data['user'] = request.user.id
serializer = MediaLikeSerializer(data=data)
if serializer.is_valid():
like = serializer.save()
push_notification.delay(media.user.id, request.user.username + ' has liked your item')
serializer = MediaGetLikeSerializer(like)
return self.get_mocked_pagination_response(status=status.HTTP_204_NO_CONTENT)
return self.get_mocked_pagination_response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)

It's Celery's mingle and gossiping. Disable by adding --without-gossip --without-mingle --without-heartbeat to the command line arguments.
Also don't forget to set BROKER_HEARTBEAT = None when you've disabled heartbeats on the commandline, otherwise you'll disconnected after 30s. It's most often better to rely on TCP keepalive then AMQP heartbeats, or even worse, Celery's own heartbeats.

Related

Celery consumer does not receive messages from SQS queue on LocalStack

I have a SQS queue on a LocalStack server and I'm trying to consume messages from it with a Celery consumer.
It seams that the consumer is properly attached to the queue, for example the queue sqs-test-queue, but it does not receive any message when I try to send one with aws command.
My celeryconfig.py looks like this:
from kombu import (
Exchange,
Queue
)
broker_transport_options = {'region': REGION}
broker_transport = 'sqs'
accept_content = ['application/json']
result_serializer = 'json'
content_encoding = 'utf-8'
task_serializer = 'json'
worker_enable_remote_control = False
worker_send_task_events = False
result_backend = None
task_queues = (
Queue('sqs-test-queue', exchange=Exchange(''), routing_key='sqs-test-queue'),
)
and my tasks.py module looks like this:
from celery import Celery
from kombu.utils.url import quote
AWS_ACCESS_KEY = quote("AWS_ACCESS_KEY")
AWS_SECRET_KEY = quote("AWS_SECRET_KEY")
LOCALSTACK = "<IP>:<PORT>"
broker_url = "sqs://{access}:{secret}#{host}".format(access=AWS_ACCESS_KEY,
secret=AWS_SECRET_KEY,
host=LOCALSTACK)
app = Celery('tasks', broker=broker_url, backend=None)
app.config_from_object('celeryconfig')
#app.task(bind=True, name='tasks.consume', acks_late=True, ignore_result=True)
def consume(self, msg):
# DO SOMETHING WITH THE RECEIVED MESSAGE
return True
Tried to execute it with celery -A tasks worker -l INFO -Q sqs-test-queue and everything seams OK:
...
[tasks]
. tasks.consume
[... INFO/MainProcess] Connected to sqs://AWS_ACCESS_KEY:**#<IP>:<PORT>//
[... INFO/MainProcess] celery#local ready
but when I try to send a message with aws sqs send-message --endpoint-url=http://<IP>:<PORT> --queue-url=http://localhost:<PORT>/queue/sqs-test-queue --message-body="Test message", nothing happens.
What am I doing wrong? Have I missed something in the configuration maybe?
PS: If I try to run the command aws sqs receive-message --endpoint-url=http://<IP>:<PORT> --queue-url=http://localhost:<PORT>/queue/sqs-test-queue, I'm able to get the message.
NOTE:
I'm using Python 3.7.0 and my pip freeze looks like this:
boto3==1.10.16
botocore==1.13.16
celery==4.3.0
kombu==4.6.6
pycurl==7.43.0.3
...
I am going through the same thing as you. To fix it I did a couple of things:
I set the HOSTNAME_EXTERNAL and HOSTNAME env variables in localstack
Set broker_url to sqs://{access}:{secret}#{host}:{port} (as you have it)
Make sure that the celery worker's broker_transport_options does not include the config item: wait_time_seconds since this causes errors with localstack as of February 7th, 2020 (check issue here).
Once I did those two things, it started working, hope it helps.
Celery can't publish or consume arbitrary messages to/from any message queue system. Use kombu for that - that is what Celery uses behind the scenes too.

Celery worker doesn't launch from Python

We have Python 3.6.1 set up with Django, Celery, and Rabbitmq on Ubuntu 14.04. Right now, I'm using the Django debug server (for dev and Apache isn't working). My current problem is that the celery workers get launched from Python and immediately die -- processes show as defunct. If I use the same command in a terminal window, the worker gets created and picks up the task if there is one waiting in the queue.
Here's the command:
celery worker --app=myapp --loglevel=info --concurrency=1 --maxtasksperchild=20 -n celery_1 -Q celery
The same functionality occurs for whichever queues are being set up.
In the terminal, we see the output myapp.settings - INFO - Loading... followed by output that describes the queue and lists the tasks. When running from Python, the last thing we see is the Loading...
In the code, we do have a check to be sure we are not running the celery command as root.
These are the Celery settings from our settings.py file:
CELERY_ACCEPT_CONTENT = ['json','pickle']
CELERY_TASK_SERIALIZER = 'pickle'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_IMPORTS = ('api.tasks',)
CELERYD_PREFETCH_MULTIPLIER = 1
CELERYD_CONCURRENCY = 1
BROKER_POOL_LIMIT = 120 # Note: I tried this set to None but it didn't seem to make any difference
CELERYD_LOG_COLOR = False
CELERY_LOG_FORMAT = '%)asctime)s - $(processName)s - %(levelname)s - %(message)s'
CELERYD_HIJACK_ROOT_LOGGER = False
STATIC_URL = '/static/'
STATIC_ROOT = os.path.join(psconf.BASE_DIR, 'myapp_static/')
BROKER_URL = psconf.MQ_URI
CELERY_RESULT_BACKEND = 'rpc'
CELERY_RESULT_PERSISTENT = True
CELERY_ROUTES = {}
for entry in os.scandir(psconf.PLUGIN_PATH):
if not entry.is_dir() or entry.name == '__pycache__':
continue
plugin_dir = entry.name
settings_file = f'{plugin_dir}.settings'
try:
plugin_tasks = importlib.import_module(settings_file)
queue_name = plugin_tasks.QUEUENAME
except ModuleNotFoundError as e:
logging.warning(e)
except AttributeError:
logging.debug(f'The plugin {plugin_dir} will use the general worker queue.')
else:
CELERY_ROUTES[f'{plugin_dir}.tasks.run'] = {'queue': queue_name}
logging.debug(f'The plugin {plugin_dir} will use the {queue_name} queue.')
Here is the part that kicks off the worker:
class CeleryWorker(BackgroundProcess):
def __init__(self, n, q):
self.name = n
self.worker_queue = q
cmd = f'celery worker --app=myapp --loglevel=info --concurrency=1 --maxtasksperchild=20 -n {self.name" -Q {self.worker_queue}'
super().__init__(cmd, cwd=str(psconf.BASE_DIR))
class BackgroundProcess(subprocess.Popen):
def __init__(self, args, **kwargs):
super().__init__(args, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True, **kwargs)
Any suggestions as to how to get this working from Python are appreciated. I'm new to Rabbitmq/Celery.
Just in case someone else needs this...It turns out that the problem was that the shell script which kicks off this whole app is now being launched with sudo and, even though I thought I was checking so we wouldn't launch the celery worker with sudo, I'd missed something and we were trying to launch as root. That is a no-no. I'm now explicitly using 'sudo -u ' and the workers are starting properly.

Celery, periodic task execution, with concurrency

I would like to launch a periodic task every second but only if the previous task ended (db polling to send task to celery).
In the Celery documentation they are using the Django cache to make a lock.
I tried to use the example:
from __future__ import absolute_import
import datetime
import time
from celery import shared_task
from django.core.cache import cache
LOCK_EXPIRE = 60 * 5
#shared_task
def periodic():
acquire_lock = lambda: cache.add('lock_id', 'true', LOCK_EXPIRE)
release_lock = lambda: cache.delete('lock_id')
a = acquire_lock()
if a:
try:
time.sleep(10)
print a, 'Hello ', datetime.datetime.now()
finally:
release_lock()
else:
print 'Ignore'
with the following configuration:
app.conf.update(
CELERY_IGNORE_RESULT=True,
CELERY_ACCEPT_CONTENT=['json'],
CELERY_TASK_SERIALIZER='json',
CELERY_RESULT_SERIALIZER='json',
CELERYBEAT_SCHEDULE={
'periodic_task': {
'task': 'app_task_management.tasks.periodic',
'schedule': timedelta(seconds=1),
},
},
)
But in the console, I never see the Ignore message and I have Hello every second. It seems that the lock is not working fine.
I launch the periodic task with:
celeryd -B -A my_app
and the worker with:
celery worker -A my_app -l info
Could you please correct my misunderstanding?
From the Django Cache Framework documentation about local-memory cache:
Note that each process will have its own private cache instance, which
means no cross-process caching is possible.
So basically your workers are each dealing with their own cache. If you need a low resource cost cache backend I would recommend File Based Cache or Database Cache, both allow cross-process.

Detect whether Celery is Available/Running

I'm using Celery to manage asynchronous tasks. Occasionally, however, the celery process goes down which causes none of the tasks to get executed. I would like to be able to check the status of celery and make sure everything is working fine, and if I detect any problems display an error message to the user. From the Celery Worker documentation it looks like I might be able to use ping or inspect for this, but ping feels hacky and it's not clear exactly how inspect is meant to be used (if inspect().registered() is empty?).
Any guidance on this would be appreciated. Basically what I'm looking for is a method like so:
def celery_is_alive():
from celery.task.control import inspect
return bool(inspect().registered()) # is this right??
EDIT: It doesn't even look like registered() is available on celery 2.3.3 (even though the 2.1 docs list it). Maybe ping is the right answer.
EDIT: Ping also doesn't appear to do what I thought it would do, so still not sure the answer here.
Here's the code I've been using. celery.task.control.Inspect.stats() returns a dict containing lots of details about the currently available workers, None if there are no workers running, or raises an IOError if it can't connect to the message broker. I'm using RabbitMQ - it's possible that other messaging systems might behave slightly differently. This worked in Celery 2.3.x and 2.4.x; I'm not sure how far back it goes.
def get_celery_worker_status():
ERROR_KEY = "ERROR"
try:
from celery.task.control import inspect
insp = inspect()
d = insp.stats()
if not d:
d = { ERROR_KEY: 'No running Celery workers were found.' }
except IOError as e:
from errno import errorcode
msg = "Error connecting to the backend: " + str(e)
if len(e.args) > 0 and errorcode.get(e.args[0]) == 'ECONNREFUSED':
msg += ' Check that the RabbitMQ server is running.'
d = { ERROR_KEY: msg }
except ImportError as e:
d = { ERROR_KEY: str(e)}
return d
From the documentation of celery 4.2:
from your_celery_app import app
def get_celery_worker_status():
i = app.control.inspect()
availability = i.ping()
stats = i.stats()
registered_tasks = i.registered()
active_tasks = i.active()
scheduled_tasks = i.scheduled()
result = {
'availability': availability,
'stats': stats,
'registered_tasks': registered_tasks,
'active_tasks': active_tasks,
'scheduled_tasks': scheduled_tasks
}
return result
of course you could/should improve the code with error handling...
To check the same using command line in case celery is running as daemon,
Activate virtualenv and go to the dir where the 'app' is
Now run : celery -A [app_name] status
It will show if celery is up or not plus no. of nodes online
Source:
http://michal.karzynski.pl/blog/2014/05/18/setting-up-an-asynchronous-task-queue-for-django-using-celery-redis/
The following worked for me:
import socket
from kombu import Connection
celery_broker_url = "amqp://localhost"
try:
conn = Connection(celery_broker_url)
conn.ensure_connection(max_retries=3)
except socket.error:
raise RuntimeError("Failed to connect to RabbitMQ instance at {}".format(celery_broker_url))
One method to test if any worker is responding is to send out a 'ping' broadcast and return with a successful result on the first response.
from .celery import app # the celery 'app' created in your project
def is_celery_working():
result = app.control.broadcast('ping', reply=True, limit=1)
return bool(result) # True if at least one result
This broadcasts a 'ping' and will wait up to one second for responses. As soon as the first response comes in, it will return a result. If you want a False result faster, you can add a timeout argument to reduce how long it waits before giving up.
I found an elegant solution:
from .celery import app
try:
app.broker_connection().ensure_connection(max_retries=3)
except Exception as ex:
raise RuntimeError("Failed to connect to celery broker, {}".format(str(ex)))
You can use ping method to check whether any worker (or specific worker) is alive or not https://docs.celeryproject.org/en/latest/_modules/celery/app/control.html#Control.ping
celey_app.control.ping()
You can test on your terminal by running the following command.
celery -A proj_name worker -l INFO
You can review every time your celery runs.
The below script is worked for me.
#Import the celery app from project
from application_package import app as celery_app
def get_celery_worker_status():
insp = celery_app.control.inspect()
nodes = insp.stats()
if not nodes:
raise Exception("celery is not running.")
logger.error("celery workers are: {}".format(nodes))
return nodes
Run celery status to get the status.
When celery is running,
(venv) ubuntu#server1:~/project-dir$ celery status
-> celery#server1: OK
1 node online.
When no celery worker is running, you get the below information displayed in terminal.
(venv) ubuntu#server1:~/project-dir$ celery status
Error: No nodes replied within time constraint

Django Celery tutorial not returning results

UDATE3: found the issue. See the answer below.
UPDATE2: It seems I might have been dealing with an automatic naming and relative imports problem by running the djcelery tutorial through the manage.py shell, see below. It is still not working for me, but now I get new log error messages. See below.
UPDATE: I added the log at the bottom of the post. It seems the example task is not registered?
Original Post:
I am trying to get django-celery up and running. I was not able to get through the example.
I installed rabbitmq succesfully and went through the tutorials without trouble: http://www.rabbitmq.com/getstarted.html
I then tried to go through the djcelery tutorial.
When I run python manage.py celeryd -l info I get the message:
[Tasks]
- app.module.add
[2011-07-27 21:17:19, 990: WARNING/MainProcess] celery#sequoia has started.
So that looks good. I put this at the top of my settings file:
import djcelery
djcelery.setup_loader()
BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
BROKER_VHOST = "/"
added these to my installed apps:
'djcelery',
here is my tasks.py file in the tasks folder of my app:
from celery.task import task
#task()
def add(x, y):
return x + y
I added this to my django.wsgi file:
os.environ["CELERY_LOADER"] = "django"
Then I entered this at the command line:
>>> from app.module.tasks import add
>>> result = add.delay(4,4)
>>> result
(AsyncResult: 7auathu945gry48- a bunch of stuff)
>>> result.ready()
False
So it looks like it worked, but here is the problem:
>>> result.result
>>> (nothing is returned)
>>> result.get()
When I put in result.get() it just hangs. What am I doing wrong?
UPDATE: This is what running the logger in the foreground says when I start up the worker server:
No handlers could be found for logger “multiprocessing”
[Configuration]
- broker: amqplib://guest#localhost:5672/
- loader: djcelery.loaders.DjangoLoader
- logfile: [stderr]#INFO
- concurrency: 4
- events: OFF
- beat: OFF
[Queues]
- celery: exchange: celery (direct) binding: celery
[Tasks]
- app.module.add
[2011-07-27 21:17:19, 990: WARNING/MainProcess] celery#sequoia has started.
C:\Python27\lib\site-packages\django-celery-2.2.4-py2.7.egg\djcelery\loaders.py:80: UserWarning: Using settings.DEBUG leads to a memory leak, neveruse this setting in production environments!
warnings.warn(“Using settings.DEBUG leads to a memory leak, never”
then when I put in the command:
>>> result = add(4,4)
This appears in the error log:
[2011-07-28 11:00:39, 352: ERROR/MainProcess] Unknown task ignored: Task of kind ‘task.add’ is not registered, please make sure it’s imported. Body->”{‘retries’: 0, ‘task’: ‘tasks.add’, ‘args’: (4,4), ‘expires’: None, ‘ta’: None
‘kwargs’: {}, ‘id’: ‘225ec0ad-195e-438b-8905-ce28e7b6ad9’}”
Traceback (most recent call last):
File “C:\Python27\..\celery\worker\consumer.py”,line 368, in receive_message
Eventer=self.event_dispatcher)
File “C:\Python27\..\celery\worker\job.py”,line 306, in from_message
**kw)
File “C:\Python27\..\celery\worker\job.py”,line 275, in __init__
self.task = tasks[self.task_name]
File “C:\Python27\...\celery\registry.py”, line 59, in __getitem__
Raise self.NotRegistered(key)
NotRegistered: ‘tasks.add’
How do I get this task to be registered and handled properly? thanks.
UPDATE 2:
This link suggested that the not registered error can be due to task name mismatches between client and worker - http://celeryproject.org/docs/userguide/tasks.html#automatic-naming-and-relative-imports
exited the manage.py shell and entered a python shell and entered the following:
>>> from app.module.tasks import add
>>> result = add.delay(4,4)
>>> result.ready()
False
>>> result.result
>>> (nothing returned)
>>> result.get()
(it just hangs there)
so I am getting the same behavior, but new log message. From the log, it appears the server is working but it won't feed the result back out:
[2011-07-28 11:39:21, 706: INFO/MainProcess] Got task from broker: app.module.tasks.add[7e794740-63c4-42fb-acd5-b9c6fcd545c3]
[2011-07-28 11:39:21, 706: INFO/MainProcess] Task app.module.tasks.add[7e794740-63c4-42fb-acd5-b9c6fcd545c3] succeed in 0.04600000038147s: 8
So the server got the task and it computed the correct answer, but it won't send it back? why not?
I found the solution to my problem from another stackoverflow post: Why does Celery work in Python shell, but not in my Django views? (import problem)
I had to add these lines to my settings file:
CELERY_RESULT_BACKEND = "amqp"
CELERY_IMPORTS = ("app.module.tasks", )
then in the task.py file I named the task as such:
#task(name="module.tasks.add")
The server and the client had to be informed of the task names. The celery and django-celery tutorials omit these lines in their tutorials.
if you run celery in debug mode is more easy understand the problem
python manage.py celeryd
What the celery logs says, celery is receiving the task ?
If not probably there is a problem with broker (wrong queue ?)
Give us more detail, in this way we can help you

Categories

Resources