Celery/RabbitMQ unacked messages blocking queue?

Celery/RabbitMQ unacked messages blocking queue? - python

I have invoked a task that fetches some information remotely with urllib2 a few thousand times. The tasks are scheduled with a random eta (within a week) so they all don't hit the server at the same time. Sometimes I get a 404, sometimes not. I am handling the error in case it happens.
In the RabbitMQ console I can see 16 unacknowledged messages:
I stopped celery, purged the queue and restarted it. The 16 unacknowledged messages were still there.
I have other tasks that go to the same queue and none of them was executed either. After purging, I tried to submit another task and it's state remains ready:
Any ideas how I can find out why messages remain unacknowledged?
Versions:
celery==3.1.4
{rabbit,"RabbitMQ","3.5.3"}
celeryapp.py
CELERYBEAT_SCHEDULE = {
'social_grabber': {
'task': '<django app>.tasks.task_social_grabber',
'schedule': crontab(hour=5, minute=0, day_of_week='sunday'),
},
}
tasks.py
#app.task
def task_social_grabber():
for user in users:
eta = randint(0, 60 * 60 * 24 * 7) #week in seconds
task_social_grabber_single.apply_async((user), countdown=eta)
There is no routing for this task defined so it goes into the default queue: celery. There is one worker processing this queue.
supervisord.conf:
[program:celery]
autostart = true
autorestart = true
command = celery worker -A <django app>.celeryapp:app --concurrency=3 -l INFO -n celery

RabbitMQ broke QoS settings in version 3.3. You need to upgrade celery to at least 3.1.11 (changelog) and kombu to at least 3.0.15 (changelog). You should use the latest versions.
I hit this exact same behavior when 3.3 was released. RabbitMQ flipped the default behavior of the prefetch_count flag. Before this, if a consumer reached the CELERYD_PREFETCH_MULTIPLIER limit in eta'd messages, the worker would up this limit in order to fetch more messages. The change broke this behavior, as the new default behavior denied this capability.

I had a similar symptoms. Messages where getting to the MQ (visible in the charts) but where not picked up by the worker.
This led me to the assumption that my Django app had correctly setup Celery app, but I was missing an import ensuring Celery would be configured during Django startup:
from __future__ import absolute_import
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app # noqa
It is a silly mistake, but the messages getting to the broker, having returned an AsyncResult, got me off track, and made me looking i the wrong places. Then I noticed that setting CELERY_ALWAYS_EAGER = True didn't do squat, event then tasks weren't executed at all.
PS: This may not be an answer to #kev question, but since I got here couple of times, while looking for the solution to my problem, I post it here for anyone in similar situation.

Related

How to propagate errors in python rq worker tasks to Sentry

I have a Flask app with Sentry error tracking. Now I created some tasks with rq, but their errors do not show up in Sentry Issues stream. I can tell the issues aren't filtered out, because the number of filtered issues doesn't increase. The errors show up in heroku logs --tail.
I run the worker with rq worker homework-fetcher -c my_app.rq_sentry
my_app/rq_sentry.py:
import os
import sentry_sdk
from sentry_sdk.integrations.rq import RqIntegration
dsn = os.environ["SENTRY_DSN"]
print(dsn) # I confirmed this appears in logs, so it is initialized
sentry_sdk.init(dsn=dsn, integrations=[RqIntegration()])
Do I have something wrong, or should I set up a full app confirming this and publish a bug report?
Also, I have a (a bit side-) question:
Should I include RqIntegration and RedisIntegration in sentry settings of the app itself? What is the benefit of these?
Thanks a lot for any help
Edit 1: when I schedule task my_app.nonexistent_module, the worker correctly raises error, which is caught by sentry.
So I maybe change my question: how to propagate Exceptions in rq worker tasks to Sentry?

So after 7 months, I figured it out.
The Flask app uses the app factory pattern. Then in the worker, I need to access the database the same way the app does, and for that, I need the app context. So I from app_factory import create_app, and then create_app().app_context().push(). And that's the issue - in the factory function, I also init Sentry for the app itself, so I end up with Sentry initialized twice - once for the app, and once for the worker. Combined with the fact that I called the app initialization in the worker tasks file (not the config), that probably overridden the correct sentry initialization and prevented the tasks from being correctly logged

django + celery: disable prefetch for one worker, Is there a bug?

I have a Django project with celery
Due to RAM limitations I can only run two worker processes.
I have a mix of 'slow' and 'fast' tasks.
Fast tasks shall be executed ASAP. There can be many fast tasks in a short time frame (0.1s - 3s), so ideally both CPUs should handle them.
Slow tasks might run for a few minutes but the result can be delayed.
Slow tasks occur less often, but it can happen that 2 or 3 are queued up at the same time.
My idea was to have one:
1 celery worker W1 with concurrency 1, that handles only fast tasks
1 celery worker W2 with concurrency 1 that can handle fast and slow tasks.
celery has by default a task prefetch multiplier ( https://docs.celeryproject.org/en/latest/userguide/configuration.html#worker-prefetch-multiplier ) of 4, which means that 4 fast tasks could be queued behind a slow task and could be delayed by several minutes. Thus I'd like to disable prefetch for worker W2. The doc states:
To disable prefetching, set worker_prefetch_multiplier to 1. Changing
that setting to 0 will allow the worker to keep consuming as many
messages as it wants.
However what I observe is, that with a prefetch_multiplier of 1 one task is prefetched and would still be delayed by a slow task.
Is this a documentation bug? Is this an implementation bug? Or do I misunderstand the documentation?
Is there any way to implement what I want?
The commands, that I execute to start the workers are:
celery -A miniclry worker --concurrency=1 -n w2 -Q=fast,slow --prefetch-multiplier 0
celery -A miniclry worker --concurrency=1 -n w1 -Q=fast
my celery settings are default except:
CELERY_BROKER_URL = "pyamqp://*****#localhost:5672/mini"
CELERY_TASK_ROUTES = {
'app1.tasks.task_fast': {"queue": "fast"},
'app1.tasks.task_slow': {"queue": "slow"},
}
my django project's celery.py file is:
from __future__ import absolute_import
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'miniclry.settings')
app = Celery("miniclry", backend="rpc", broker="pyamqp://")
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
The __init__.py of my django project is
from .celery import app as celery_app
__all__ = ('celery_app',)
The code of my workers
import time, logging
from celery import shared_task
from miniclry.celery import app as celery_app
logger = logging.getLogger(__name__)
#shared_task
def task_fast(delay=0.1):
logger.warning("fast in")
time.sleep(delay)
logger.warning("fast out")
#shared_task
def task_slow(delay=30):
logger.warning("slow in")
time.sleep(delay)
logger.warning("slow out")
If I execute following from a management shell I see, that one fast task is only executed after the slow task finished.
from app1.tasks import task_fast, task_slow
task_slow.delay()
for i in range(30):
task_fast.delay()
Can anybody help?
I could post the entire test project if this is considered helpful. Just advise about the recommended SO way of exchanging such kind of projects
Version info:
celery==4.3.0
Django==1.11.25
Python 2.7.12

I confirm the issue, there is a bug in this section of the documentation. worker_prefetch_multiplier = 1 will just as it says, set the worker's prefetch to 1, means worker will hold one more task in addition to one that is executing at the moment.
To actually disable the prefetch you also need to use task_acks_late = True along with the prefetch setting, see this docs section

Django Celery delay() always pushing to default 'celery' queue

I'm ripping my hair out with this one.
The crux of my issue is that, using the Django CELERY_DEFAULT_QUEUE setting in my settings.py is not forcing my tasks to go to that particular queue that I've set up. It always goes to the default celery queue in my broker.
However, if I specify queue=proj:dev in the shared_task decorator, it goes to the correct queue. It behaves as expected.
My setup is as follows:
Django code on my localhost (for testing and stuff). Executing task .delay()'s via Django's shell (manage.py shell)
a remote Redis instance configured as my broker
2 celery workers configured on a remote machine setup and waiting for messages from Redis (On Google App Engine - irrelevant perhaps)
NB: For the pieces of code below, I've obscured the project name and used proj as a placeholder.
celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery, shared_task
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')
app = Celery('proj')
app.config_from_object('django.conf:settings', namespace='CELERY', force=True)
app.autodiscover_tasks()
#shared_task
def add(x, y):
return x + y
settings.py
...
CELERY_RESULT_BACKEND = 'django-db'
CELERY_BROKER_URL = 'redis://:{}#{}:6379/0'.format(
os.environ.get('REDIS_PASSWORD'),
os.environ.get('REDIS_HOST', 'alice-redis-vm'))
CELERY_DEFAULT_QUEUE = os.environ.get('CELERY_DEFAULT_QUEUE', 'proj:dev')
The idea is that, for right now, I'd like to have different queues for the different environments that my code exists in: dev, staging, prod. Thus, on Google App Engine, I define an environment variable that is passed based on the individual App Engine service.
Steps
So, with the above configuration, I fire up the shell using ./manage.py shell and run add.delay(2, 2). I get an AsyncResult back but Redis monitor clearly shows a message was sent to the default celery queue:
1497566026.117419 [0 155.93.144.189:58887] "LPUSH" "celery"
...
What am I missing?
Not to throw a spanner in the works, but I feel like there was a point today at which this was actually working. But for the life of me, I can't think what part of my brain is failing me here.
Stack versions:
python: 3.5.2
celery: 4.0.2
redis: 2.10.5
django: 1.10.4

This issue is far more simple than I thought - incorrect documentation!!
The Celery documentation asks us to use CELERY_DEFAULT_QUEUE to set the task_default_queue configuration on the celery object.
Ref: http://docs.celeryproject.org/en/latest/userguide/configuration.html#new-lowercase-settings
We should currently use CELERY_TASK_DEFAULT_QUEUE. This is an inconsistency in the naming of all the other settings' names. It was raised on Github here - https://github.com/celery/celery/issues/3772
Solution summary
Using CELERY_DEFAULT_QUEUE in a configuration module (using config_from_object) has no effect on the queue.
Use CELERY_TASK_DEFAULT_QUEUE instead.

If you are here because you're trying to implement a predefined queue using SQS in Celery and find that Celery creates a new queue called "celery" in SQS regardless of what you say, you've reached the end of your journey friend.
Before passing broker_transport_options to Celery, change your default queue and/or specify the queues you will use explicitly. In my case, I need just the one queue so doing the following worked:
celery.conf.task_default_queue = "<YOUR_PREDEFINED_QUEUE_NAME_IN_SQS">

celery worker not working though rabbitmq has queue buildup

I am getting in touch with celery and I wrote a task by following Tutorial but somehow worker not getting up and I get following log
After entering command:
celery worker -A tasks -l debug
I get a log:
Running a worker with superuser privileges when the
worker accepts messages serialized with pickle is a very bad idea!
If you really want to continue then you have to set the C_FORCE_ROOT
environment variable (but please think about this before you do).
User information: uid=0 euid=0 gid=0 egid=0
And here is my task:
from celery import Celery
app = Celery('tasks', backend='amqp',broker='amqp://sanjay:**#localhost:5672//')
#app.task
def gen_prime(x):
multiples = []
results = []
for i in xrange(2, x+1):
if i not in multiples:
results.append(i)
for j in xrange(i*i, x+1, i):
multiples.append(j)
return results
Though in rabbitmq admin console I see some queue build up when I try to generate prime numbers in ipython console but i am not getting result back on the console.
Here is my console action:
>>> from tasks import gen_prime
>>> pr=gen_prime.delay(10000)
>>> pr.ready()
False
>>>
>>> pr.ready()
False
>>> pr.ready()
False
I am trying to solve this one from last 3 days but I was not able to solve it.

The error message pretty much tells you what's going on in this case. You're trying to run the worker as root (generally a bad idea due to security concerns). If you want to override this and allow it to run, you must set your environment:
export C_FORCE_ROOT="true"
Then run the worker.
Or you can just run it as a different user, which is preferred. You can search for how to add a user. Then you simply login as that user or su and execute your worker.
Since you tagged this digital ocean, here is a link to their tutorial on how to add a user:
https://www.digitalocean.com/community/tutorials/how-to-add-and-delete-users-on-ubuntu-12-04-and-centos-6
Also, celery has some docs regarding how to daemonize your workers. I usually use the supervisord method.
https://celery.readthedocs.org/en/latest/tutorials/daemonizing.html#centos

Don't run celery workers as root.
I recommend using supervisord to manage celery workers - you can use the user configuration directive to specify which user to run the celery workers as.

Celery worker hangs without any error

I have a production setup for running celery workers for making a POST / GET request to remote service and storing result, It is handling load around 20k tasks per 15 min.
The problem is that the workers go numb for no reason, no errors, no warnings.
I have tried adding multiprocessing also, the same result.
In log I see the increase in the time of executing task, like succeeded in s
For more details look at https://github.com/celery/celery/issues/2621

If your celery worker get stuck sometimes, you can use strace & lsof to find out at which system call it get stuck.
For example:
$ strace -p 10268 -s 10000
Process 10268 attached - interrupt to quit
recvfrom(5,
10268 is the pid of celery worker, recvfrom(5 means the worker stops at receiving data from file descriptor.
Then you can use lsof to check out what is 5 in this worker process.
lsof -p 10268
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
......
celery 10268 root 5u IPv4 828871825 0t0 TCP 172.16.201.40:36162->10.13.244.205:wap-wsp (ESTABLISHED)
......
It indicates that the worker get stuck at a tcp connection(you can see 5u in FD column).
Some python packages like requests is blocking to wait data from peer, this may cause celery worker hangs, if you are using requests, please make sure to set timeout argument.
Have you seen this page:
https://www.caktusgroup.com/blog/2013/10/30/using-strace-debug-stuck-celery-tasks/

I also faced the issue, when I was using delay shared_task with
celery, kombu, amqp, billiard. After calling the API when I used
delay() for #shared_task, all functions well but when it goes to delay
it hangs up.
So, the issue was In main Application init.py, the below settings
were missing
This will make sure the app is always imported when # Django starts so that shared_task will use this app.
In init.py
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celeryApp
#__all__ = ('celeryApp',)
__all__ = ['celeryApp']
Note1: In place of celery_app put the Aplication name, means the Application mentioned in celery.py import the App and put here
Note2:** If facing only hangs issue in shared task above solution may solve your issue and ignore below matters.
Also wanna mention A=another issue, If anyone facing Error 111
connection issue then please check the versions of amqp==2.2.2,
billiard==3.5.0.3, celery==4.1.0, kombu==4.1.0 whether they are
supporting or not. Mentioned versions are just an example. And Also
check whether redis is install in your system(If any any using redis).
Also make sure you are using Kombu 4.1.0. In the latest version of
Kombu renames async to asynchronous.

Follow this tutorial
Celery Django Link
Add the following to the settings
NB Install redis for both transport and result
# TRANSPORT
CELERY_BROKER_TRANSPORT = 'redis'
CELERY_BROKER_HOST = 'localhost'
CELERY_BROKER_PORT = '6379'
CELERY_BROKER_VHOST = '0'
# RESULT
CELERY_RESULT_BACKEND = 'redis'
CELERY_REDIS_HOST = 'localhost'
CELERY_REDIS_PORT = '6379'
CELERY_REDIS_DB = '1'

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Celery/RabbitMQ unacked messages blocking queue? - python

Related

How to propagate errors in python rq worker tasks to Sentry

django + celery: disable prefetch for one worker, Is there a bug?

Django Celery delay() always pushing to default 'celery' queue

celery worker not working though rabbitmq has queue buildup

Celery worker hangs without any error

Categories

Resources