celery worker not working though rabbitmq has queue buildup - python

I am getting started with Celery and wrote a task by following the tutorial, but the worker does not start and I get the log below.
After entering command:
celery worker -A tasks -l debug
I get a log:
Running a worker with superuser privileges when the
worker accepts messages serialized with pickle is a very bad idea!
If you really want to continue then you have to set the C_FORCE_ROOT
environment variable (but please think about this before you do).
User information: uid=0 euid=0 gid=0 egid=0
And here is my task:
from celery import Celery

app = Celery('tasks', backend='amqp', broker='amqp://sanjay:**@localhost:5672//')

@app.task
def gen_prime(x):
    multiples = []
    results = []
    for i in xrange(2, x+1):
        if i not in multiples:
            results.append(i)
            for j in xrange(i*i, x+1, i):
                multiples.append(j)
    return results
In the RabbitMQ admin console I can see the queue building up when I try to generate prime numbers from the IPython console, but I never get a result back on the console.
Here is my console action:
>>> from tasks import gen_prime
>>> pr=gen_prime.delay(10000)
>>> pr.ready()
False
>>>
>>> pr.ready()
False
>>> pr.ready()
False
I have been trying to solve this for the last 3 days without success.

The error message pretty much tells you what's going on in this case. You're trying to run the worker as root (generally a bad idea due to security concerns). If you want to override this and allow it to run, you must set your environment:
export C_FORCE_ROOT="true"
Then run the worker.
Or you can run it as a different user, which is preferred. Search for how to add a user on your system, then log in as that user (or use su) and start your worker.
Since you tagged this digital ocean, here is a link to their tutorial on how to add a user:
https://www.digitalocean.com/community/tutorials/how-to-add-and-delete-users-on-ubuntu-12-04-and-centos-6
Also, celery has some docs regarding how to daemonize your workers. I usually use the supervisord method.
https://celery.readthedocs.org/en/latest/tutorials/daemonizing.html#centos

Don't run celery workers as root.
I recommend using supervisord to manage celery workers - you can use the user configuration directive to specify which user to run the celery workers as.
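To illustrate the advice above, a launcher script could refuse to start a worker when it runs with superuser privileges. This is only a sketch: the names is_superuser and refuse_root are hypothetical and not part of Celery, and it relies on os.geteuid, which is available on Unix only.

```python
import os
import sys


def is_superuser(euid=None):
    """Return True when the effective user id is 0 (root)."""
    if euid is None:
        euid = os.geteuid()  # Unix only
    return euid == 0


def refuse_root(euid=None):
    """Exit with a message instead of starting a worker as root."""
    if is_superuser(euid):
        sys.exit("Refusing to start the worker as root; use an unprivileged user.")
```

Calling refuse_root() at the top of whatever script starts the worker would stop it early when euid is 0, which is the situation shown in the log (uid=0 euid=0).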

Related

Celery/Django: Get result of periodic task execution

I have a Django 1.7 project using Celery (latest). I have a REST API that receives some parameters, and creates, programmatically, a PeriodicTask. For testing, I'm using a period of seconds:
periodic_task, _= PeriodicTask.objects.get_or_create(name=task_label, task=task_name, interval=interval_schedule)
I store a reference to this task somewhere. I start celery beat:
python manage.py celery beat
and a worker:
python manage.py celery worker --loglevel=info
and my task runs as I can see in the worker's output.
I've set the result backend:
CELERY_RESULT_BACKEND = 'djcelery.backends.database:DatabaseBackend'
and with that, I can check the task results using the TaskMeta model. The objects there contain the task_id (the same one I would get if I called the task with .delay() or .apply_async()), the status, the result, everything. Beautiful.
However, I can't find a connection between the PeriodicTask object and TaskMeta.
PeriodicTask has a task property, but it's just the task name/path. The id is just a sequential number, not the task_id from TaskMeta. I really need to match a task that was executed as a PeriodicTask with its TaskMeta entry so I can offer some monitoring of its status. TaskMeta doesn't have any other value that lets me identify which task ran (I will have several), so that I could at least report the status of the last execution.
I've checked all over Celery docs and in here, but no solution so far.
Any help is highly appreciated.
Thanks
You can run a service that monitors which tasks have been performed, from the command line:
python manage.py celerycam --frequency=10.0
More detail at:
http://www.lexev.org/en/2014/django-celery-setup/

Celery/RabbitMQ unacked messages blocking queue?

I have invoked a task that fetches some information remotely with urllib2 a few thousand times. The tasks are scheduled with a random eta (within a week) so they all don't hit the server at the same time. Sometimes I get a 404, sometimes not. I am handling the error in case it happens.
In the RabbitMQ console I can see 16 unacknowledged messages:
I stopped celery, purged the queue and restarted it. The 16 unacknowledged messages were still there.
I have other tasks that go to the same queue and none of them was executed either. After purging, I tried to submit another task and its state remains ready:
Any ideas how I can find out why messages remain unacknowledged?
Versions:
celery==3.1.4
{rabbit,"RabbitMQ","3.5.3"}
celeryapp.py
CELERYBEAT_SCHEDULE = {
    'social_grabber': {
        'task': '<django app>.tasks.task_social_grabber',
        'schedule': crontab(hour=5, minute=0, day_of_week='sunday'),
    },
}
tasks.py
@app.task
def task_social_grabber():
    for user in users:
        eta = randint(0, 60 * 60 * 24 * 7)  # week in seconds
        # args must be a tuple, hence the trailing comma
        task_social_grabber_single.apply_async((user,), countdown=eta)
There is no routing for this task defined so it goes into the default queue: celery. There is one worker processing this queue.
supervisord.conf:
[program:celery]
autostart = true
autorestart = true
command = celery worker -A <django app>.celeryapp:app --concurrency=3 -l INFO -n celery
RabbitMQ broke QoS settings in version 3.3. You need to upgrade celery to at least 3.1.11 (changelog) and kombu to at least 3.0.15 (changelog). You should use the latest versions.
I hit this exact same behavior when 3.3 was released. RabbitMQ flipped the default behavior of the prefetch_count flag. Before this, if a consumer reached the CELERYD_PREFETCH_MULTIPLIER limit in eta'd messages, the worker would up this limit in order to fetch more messages. The change broke this behavior, as the new default behavior denied this capability.
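If you want to check an environment against the minimum versions named above, a sketch along these lines may help. The helper names parse_version and outdated are hypothetical, the parser assumes purely numeric dotted versions, and the kombu version in the usage note is made up for illustration.

```python
# Minimum versions quoted in the answer above.
MINIMUMS = {"celery": "3.1.11", "kombu": "3.0.15"}


def parse_version(text):
    """Turn '3.1.11' into (3, 1, 11); assumes purely numeric dotted parts."""
    return tuple(int(part) for part in text.split("."))


def outdated(installed):
    """Return the sorted names of packages older than the minimums."""
    return sorted(
        name
        for name, minimum in MINIMUMS.items()
        if parse_version(installed[name]) < parse_version(minimum)
    )
```

For example, outdated({"celery": "3.1.4", "kombu": "3.0.8"}) returns ['celery', 'kombu']. The versions are compared as tuples of integers on purpose: compared as plain strings, "3.1.4" would sort after "3.1.11" and the old celery from the question would look new enough.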
I had similar symptoms. Messages were reaching the MQ (visible in the charts) but were not picked up by the worker.
This led me to assume that my Django app had set up the Celery app correctly, but I was missing an import that ensures Celery is configured during Django startup:
from __future__ import absolute_import
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app # noqa
It is a silly mistake, but the fact that messages reached the broker and an AsyncResult was returned threw me off track and had me looking in the wrong places. Then I noticed that setting CELERY_ALWAYS_EAGER = True had no effect: even then, the tasks weren't executed at all.
PS: This may not be an answer to @kev's question, but since I ended up here a couple of times while looking for the solution to my problem, I am posting it for anyone in a similar situation.

Celery task is hanging with http request

I'm testing Celery tasks and have run into an issue. If a task contains code that makes an HTTP request (through urllib.urlopen), it hangs. What could be the reason?
I am just trying to start with a minimal config with Flask.
I tried RabbitMQ and Redis as broker and backend, but the result is the same.
The file with the tasks (run_celery.py):
...import celery and flask app...
celery = Celery(
    app.import_name,
    backend=app.config['CELERY_BROKER_URL'],
    broker=app.config['CELERY_BROKER_URL']
)

@celery.task
def test_task(a):
    print(a)
    print(requests.get('http://google.com'))
This is how I launched the worker:
celery -A run_celery.celery worker -l debug
After this, I run ipython and call the task:
from run_celery import test_task
test_task.apply_async(('sfas',))
The worker begins performing the task:
...
Received task: run_celery.test_task...
sfas
Starting new HTTP connection (1)...
After this it hangs.
This only happens if the task contains a request.
What did I do wrong?
I found the cause in my code, and it surprised me. I don't know why it happens, but the file with the tasks imports a Model, and when that import executes it initializes a MagentoAPI instance (https://github.com/bernieke/python-magento). If I comment out this initialization, the requests in Celery tasks work correctly.

Celery worker hangs without any error

I have a production setup running Celery workers that make POST / GET requests to a remote service and store the result. It handles a load of around 20k tasks per 15 min.
The problem is that the workers go numb for no reason: no errors, no warnings.
I have also tried adding multiprocessing, with the same result.
In the log I see the task execution time increasing, like succeeded in s
For more details look at https://github.com/celery/celery/issues/2621
If your Celery worker gets stuck sometimes, you can use strace and lsof to find out which system call it is stuck in.
For example:
$ strace -p 10268 -s 10000
Process 10268 attached - interrupt to quit
recvfrom(5,
10268 is the pid of the Celery worker; recvfrom(5 means the worker is stopped while receiving data from file descriptor 5.
Then you can use lsof to check what file descriptor 5 is in this worker process.
lsof -p 10268
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
......
celery 10268 root 5u IPv4 828871825 0t0 TCP 172.16.201.40:36162->10.13.244.205:wap-wsp (ESTABLISHED)
......
This indicates that the worker is stuck on a TCP connection (see 5u in the FD column).
Some Python packages, such as requests, block while waiting for data from the peer, which can make the Celery worker hang. If you are using requests, make sure to set the timeout argument.
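To show why a timeout matters, here is a minimal sketch using only the standard library socket module rather than requests, so it runs without extra packages. The helper name read_with_timeout is hypothetical. Without the timeout, the recv call would wait for a silent peer indefinitely, which is the kind of blocked read seen in the strace output above.

```python
import socket


def read_with_timeout(host, port, timeout):
    """Return the first bytes the peer sends, or None if it stays silent.

    The timeout (in seconds) applies to the connect and to the read.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            return sock.recv(1024)  # would block forever without a timeout
        except socket.timeout:
            return None
```

With requests the equivalent safeguard is the timeout keyword, for example requests.get(url, timeout=10), which raises an exception instead of letting the task hang.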
Have you seen this page:
https://www.caktusgroup.com/blog/2013/10/30/using-strace-debug-stuck-celery-tasks/
I also faced this issue when using delay on a shared_task with celery, kombu, amqp and billiard. After calling the API, everything works well until it reaches delay() on the @shared_task, where it hangs.
The issue was that the settings below were missing from the main application's __init__.py. They make sure the app is always imported when Django starts, so that shared_task will use this app.
In __init__.py:
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celeryApp
#__all__ = ('celeryApp',)
__all__ = ['celeryApp']
Note 1: In place of celery_app, use your application's name, i.e. import the app defined in celery.py and put it here.
Note 2: If the only issue you face is the hang in a shared task, the solution above may solve it and you can ignore what follows.
I also want to mention another issue: if anyone is facing an Error 111 connection issue, check whether your versions of amqp, billiard, celery and kombu (for example amqp==2.2.2, billiard==3.5.0.3, celery==4.1.0, kombu==4.1.0) are compatible with each other. The versions mentioned are just an example. Also check whether Redis is installed on your system (if you are using Redis).
Also make sure you are using Kombu 4.1.0. The latest version of Kombu renames async to asynchronous.
Follow this tutorial
Celery Django Link
Add the following to the settings.
NB: Install Redis for both transport and result.
# TRANSPORT
CELERY_BROKER_TRANSPORT = 'redis'
CELERY_BROKER_HOST = 'localhost'
CELERY_BROKER_PORT = '6379'
CELERY_BROKER_VHOST = '0'
# RESULT
CELERY_RESULT_BACKEND = 'redis'
CELERY_REDIS_HOST = 'localhost'
CELERY_REDIS_PORT = '6379'
CELERY_REDIS_DB = '1'

How to purge all tasks of a specific queue with celery in python?

How can I purge all scheduled and running tasks of a specific queue with Celery in Python? The question seems pretty straightforward, but to be clear, I am not looking for the command-line approach.
I have the following line, which defines the queue, and I would like to purge that queue to manage tasks:
CELERY_ROUTES = {"socialreport.tasks.twitter_save": {"queue": "twitter_save"}}
At some point I want to purge all tasks in the queue twitter_save from Python code, maybe with a broadcast function? I couldn't find documentation about this. Is it possible?
Just to update @Sam Stoelinga's answer for Celery 3.1: it can now be done like this from a terminal:
celery amqp queue.purge <QUEUE_NAME>
For Django be sure to start it from the manage.py file:
./manage.py celery amqp queue.purge <QUEUE_NAME>
If not, be sure celery is able to point correctly to the broker by setting the --broker= flag.
The original answer does not work for Celery 3.1. Hassek's update is the correct command if you want to do it from the command line. But if you want to do it programmatically, do this:
Assuming you created your Celery app as:
celery_app = Celery(...)
Then:
import celery.bin.amqp
amqp = celery.bin.amqp.amqp(app = celery_app)
amqp.run('queue.purge', 'name_of_your_queue')
This is handy for cases where you've enqueued a bunch of tasks, and one task encounters a fatal condition that you know will prevent the rest of the tasks from executing.
E.g. you enqueued a bunch of web crawler tasks, and in the middle of your tasks your server's IP address gets blocked. There's no point in executing the rest of the tasks. So in that case, your task itself can purge its own queue.
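If you would rather not go through the celery.bin.amqp command class, a sketch like the following purges one queue through the broker connection. It assumes app.connection() returns a Kombu connection whose default_channel offers queue_purge, which as far as I know holds for the AMQP transport; purge_queue is a hypothetical helper name and I have not checked it against every Celery version.

```python
def purge_queue(app, queue_name):
    """Remove all waiting messages from one queue.

    `app` is expected to be a Celery application (assumption). Returns
    whatever the transport reports for the purge, typically the number
    of messages removed.
    """
    # The connection is released again when the with-block exits.
    with app.connection() as conn:
        return conn.default_channel.queue_purge(queue_name)
```

Usage would be purge_queue(celery_app, 'twitter_save'). Only messages still waiting in the broker are removed; tasks that a worker has already reserved or started are not affected.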
Lol it's quite easy, hope somebody can help me still though.
from celery.bin.camqadm import camqadm
camqadm('queue.purge', queue_name_as_string)
The only problem with this is that I still need to stop celeryd before purging the queue, and after purging I need to start celeryd again to handle tasks for the queue. I will update this question if I succeed.
I succeeded, but please correct me if this is not a good way to stop celeryd, purge the queue and start it again. I know I am killing the process forcibly; that is intentional, because I actually want the running task to be terminated.
kill_command = "ps auxww | grep 'celeryd -n twitter_save' | awk '{print $2}' | xargs kill -9"
subprocess.call(kill_command, shell=True)
camqadm('queue.purge', 'twitter_save')
rerun_command = "/home/samos/Software/virt_env/twittersyncv1/bin/python %s/manage.py celeryd -n twitter_save -l info -Q twitter_save" % settings.PROJECT_ROOT
os.popen(rerun_command+' &')
send_task("socialreport.tasks.twitter_save")
