I am trying to run a Celery worker with the script in [1], based on the configuration in [2], but I get the error in [3] when I launch the program. My program runs on medusa1-blank1, and the rabbitmq-server runs on hadoop-medusa-1. You can see in [1] that the $HOST_NAME variable is medusa1-blank1 and that celeryconfig.py contains the host address where rabbitmq-server is running.
I looked at my configuration and cannot find any error in it. I would like the log to be more verbose so I could understand what is going on, but I don't think that is possible. Since it looks like the error is not in my code, I am completely clueless about what is happening. Any help debugging this?
[1] Script that I use to run with celery
#!/bin/bash
set -xv
# This script runs celery on the server host
export C_FORCE_ROOT="true"
HOST_NAME=`hostname`
echo "------------------------"
echo "Initialize celery at $HOST_NAME"
echo "------------------------"
celery worker -n $HOST_NAME -E --loglevel=DEBUG --concurrency=20 -f ./logs/celerydebug.log --config=celeryconfig -Q $HOST_NAME
# celery worker -n medusa1-blank1 -E --loglevel=DEBUG --concurrency=20 -f ./logs/celerydebug.log --config=celeryconfig -Q medusa1-blank1
[2] Configuration that I use:
(medusa-env)xubuntu@medusa1-blank1:~/Programs/medusa-1.0$ cat celeryconfig.py
import os
import sys
# add hadoop python to the env, just for the running
sys.path.append(os.path.dirname(os.path.basename(__file__)))
# broker configuration
BROKER_URL = "amqp://celeryuser:celery@hadoop-medusa-1/celeryvhost"
CELERY_RESULT_BACKEND = "amqp"
CELERY_RESULT_PERSISTENT = True
TEST_RUNNER = 'celery.contrib.test_runner.run_tests'
# for debug
# CELERY_ALWAYS_EAGER = True
# module loaded
CELERY_IMPORTS = ("manager.mergedirs", "manager.system", "manager.utility", "manager.pingdaemon", "manager.hdfs")
[3] Error that I have:
[2016-03-07 10:24:09,482: DEBUG/MainProcess] | Worker: Preparing bootsteps.
[2016-03-07 10:24:09,484: DEBUG/MainProcess] | Worker: Building graph...
[2016-03-07 10:24:09,484: DEBUG/MainProcess] | Worker: New boot order: {Timer, Hub, Queues (intra), Pool, Autoscaler, Autoreloader, StateDB, Beat, Consumer}
[2016-03-07 10:24:09,487: DEBUG/MainProcess] | Consumer: Preparing bootsteps.
[2016-03-07 10:24:09,487: DEBUG/MainProcess] | Consumer: Building graph...
[2016-03-07 10:24:09,491: DEBUG/MainProcess] | Consumer: New boot order: {Connection, Agent, Events, Mingle, Tasks, Control, Heart, Gossip, event loop}
[2016-03-07 10:24:09,491: WARNING/MainProcess] /home/xubuntu/Programs/medusa-1.0/medusa-env/local/lib/python2.7/site-packages/celery/apps/worker.py:161: CDeprecationWarning:
Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers
the ability to execute any command. It's important to secure
your broker from unauthorized access when using pickle, so we think
that enabling pickle should require a deliberate action and not be
the default choice.
If you depend on pickle then you should set a setting to disable this
warning and to be sure that everything will continue working
when you upgrade to Celery 3.2::
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
You must only enable the serializers that you will actually use.
warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED))
[2016-03-07 10:24:09,493: ERROR/MainProcess] Unrecoverable error: AttributeError("'NoneType' object has no attribute 'rstrip'",)
Traceback (most recent call last):
File "/home/xubuntu/Programs/medusa-1.0/medusa-env/local/lib/python2.7/site-packages/celery/worker/__init__.py", line 206, in start
self.blueprint.start(self)
File "/home/xubuntu/Programs/medusa-1.0/medusa-env/local/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
self.on_start()
File "/home/xubuntu/Programs/medusa-1.0/medusa-env/local/lib/python2.7/site-packages/celery/apps/worker.py", line 169, in on_start
string(self.colored.cyan(' \n', self.startup_info())),
File "/home/xubuntu/Programs/medusa-1.0/medusa-env/local/lib/python2.7/site-packages/celery/apps/worker.py", line 230, in startup_info
results=self.app.backend.as_uri(),
File "/home/xubuntu/Programs/medusa-1.0/medusa-env/local/lib/python2.7/site-packages/celery/backends/base.py", line 117, in as_uri
else maybe_sanitize_url(self.url).rstrip("/"))
AttributeError: 'NoneType' object has no attribute 'rstrip'
I don't know which version you're using, but I found this bug report:
https://github.com/celery/celery/issues/3094
Bottom line: roll back for now.
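If you do roll back while waiting for a fix, pinning an older release with pip should be enough; the exact version to pin is an assumption on my part, so check the issue thread first:
pip install 'celery==3.1.18'  # hypothetical known-good version; adjust as needed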
My minimal configuration file would be:
CELERY_IMPORTS = ...
AMQP_USERNAME = os.getenv('AMQP_USERNAME', '...')
AMQP_PASSWORD = os.getenv('AMQP_PASSWORD', '...')
AMQP_HOST = os.getenv('AMQP_HOST', '172.17.42.1')
AMQP_PORT = int(os.getenv('AMQP_PORT', '5672'))
DEFAULT_BROKER_URL = 'amqp://%s:%s@%s:%d' \
    % (AMQP_USERNAME, AMQP_PASSWORD, AMQP_HOST, AMQP_PORT)
CELERY_RESULT_BACKEND = 'amqp://%s:%s@%s:%d' \
    % (AMQP_USERNAME, AMQP_PASSWORD, AMQP_HOST, AMQP_PORT)
BROKER_URL = DEFAULT_BROKER_URL
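For the question above, the traceback shows as_uri() failing because the result backend has no URL: CELERY_RESULT_BACKEND = "amqp" only names the backend type. A minimal sketch of the change I would try in celeryconfig.py, reusing the broker address from the question (an untested assumption on my part):
# celeryconfig.py -- give the amqp result backend a full URL instead of just "amqp",
# so backend.as_uri() has an actual URL to sanitize and rstrip
BROKER_URL = "amqp://celeryuser:celery@hadoop-medusa-1/celeryvhost"
CELERY_RESULT_BACKEND = "amqp://celeryuser:celery@hadoop-medusa-1/celeryvhost"
CELERY_RESULT_PERSISTENT = True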
Related
I wrote code to send an email to a user 5 times as an asynchronous task, using Celery with a Redis broker in a Django project. My Celery worker is running and responding to the celery CLI, and it even receives the task from Django, but after that I get an error like:
Traceback (most recent call last):
  File "c:\users\vipin\appdata\local\programs\python\python37-32\lib\site-packages\billiard\pool.py", line 358, in workloop
    result = (True, prepare_result(fun(*args, **kwargs)))
  File "c:\users\vipin\appdata\local\programs\python\python37-32\lib\site-packages\celery\app\trace.py", line 544, in _fast_trace_task
    tasks, accept, hostname = _loc
ValueError: not enough values to unpack (expected 3, got 0)
task.py -
from celery.decorators import task
from django.core.mail import EmailMessage
import time

@task(name="Sending_Emails")
def send_email(to_email, message):
    time1 = 1
    while time1 != 5:
        print("Sending Email")
        email = EmailMessage('Checking Asynchronous Task', message + str(time1), to=[to_email])
        email.send()
        time.sleep(1)
        time1 += 1
views.py -
print("sending for Queue")
send_email.delay(request.user.email,"Email sent : ")
print("sent for Queue")
settings.py -
# CELERY STUFF
BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'Asia/India'
celery.py -
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'ECartApplication.settings')

app = Celery('ECartApplication')

# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
I expect the email to be sent 5 times, but instead I get this error:
[tasks]
. ECartApplication.celery.debug_task
. Sending_Emails
[2019-05-19 12:41:27,695: INFO/SpawnPoolWorker-2] child process 3628 calling self.run()
[2019-05-19 12:41:27,696: INFO/SpawnPoolWorker-1] child process 5748 calling self.run()
[2019-05-19 12:41:28,560: INFO/MainProcess] Connected to redis://localhost:6379//
[2019-05-19 12:41:30,599: INFO/MainProcess] mingle: searching for neighbors
[2019-05-19 12:41:35,035: INFO/MainProcess] mingle: all alone
[2019-05-19 12:41:39,069: WARNING/MainProcess] c:\users\vipin\appdata\local\programs\python\python37-32\lib\site-packages\celery\fixups\django.py:202: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
  warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2019-05-19 12:41:39,070: INFO/MainProcess] celery@vipin-PC ready.
[2019-05-19 12:41:46,448: INFO/MainProcess] Received task: Sending_Emails[db10dad4-a8ec-4ad2-98a6-60e8c3183dd1]
[2019-05-19 12:41:47,455: ERROR/MainProcess] Task handler raised error: ValueError('not enough values to unpack (expected 3, got 0)')
Traceback (most recent call last):
  File "c:\users\vipin\appdata\local\programs\python\python37-32\lib\site-packages\billiard\pool.py", line 358, in workloop
    result = (True, prepare_result(fun(*args, **kwargs)))
  File "c:\users\vipin\appdata\local\programs\python\python37-32\lib\site-packages\celery\app\trace.py", line 544, in _fast_trace_task
    tasks, accept, hostname = _loc
ValueError: not enough values to unpack (expected 3, got 0)
This is an issue when you run Python on Windows 7/10.
There is a workaround: you just need to use the eventlet module, which you can install using pip:
pip install eventlet
After that, start your worker with -P eventlet at the end of the command:
celery -A MyWorker worker -l info -P eventlet
The command below also works on Windows 11:
celery -A core worker --pool=solo -l info
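For the Django project in this question, that would presumably translate to something like the following (I'm assuming -A ECartApplication is the right app target, since that is the Celery app defined in celery.py above):
pip install eventlet
celery -A ECartApplication worker -l info -P eventlet
# or, using the solo pool instead of eventlet:
celery -A ECartApplication worker -l info --pool=solo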
When Celery receives a task, the task never gets executed; it just hangs:
These tasks arrive randomly, under very low load.
Celery 3.1.20
[2016-03-02 22:33:08,300: INFO/MainProcess] Received task: catalogue.cluster.deploy_cluster.imbue_cluster[A5C030C4E0]
[2016-03-02 22:33:08,303: INFO/MainProcess] Scaling up 1 processes.
After this, nothing happens.
I started celery with supervisord using a shell script:
source ~/.profile
CELERY_LOGFILE=/usr/local/src/imbue/application/imbue/log/celeryd.log
CELERYD_OPTS=" --loglevel=INFO --autoscale=10,5"
cd /usr/local/src/imbue/application/imbue/conf
exec celery worker -n celeryd@%h -f $CELERY_LOGFILE $CELERYD_OPTS
My configuration:
CELERYD_CHDIR=settings.filepath
CELERY_IGNORE_RESULT = False
CELERY_RESULT_BACKEND = "amqp"
CELERY_TASK_RESULT_EXPIRES = 360000
CELERY_RESULT_PERSISTENT = True
BROKER_URL=<rabbitmq>
CELERY_ENABLE_UTC=True
CELERY_TIMEZONE= "US/Eastern"
CELERY_IMPORTS=("catalogue.cluster.deploy_cluster",
"tools.deploy_tools",)
This is how I call my tasks:
celery = Celery()
celery.config_from_object('conf.celeryconfig')
celery.send_task("catalogue.cluster.deploy_cluster.imbue_cluster",
                 kwargs={'configuration': configuration,
                         'job': job_instance,
                         'api_call': True},
                 task_id=job_instance.reference)

@task(bind=True, default_retry_delay=300, max_retries=5)
def imbue_cluster(...)
Similar issues:
http://comments.gmane.org/gmane.comp.python.amqp.celery.user/4990
https://groups.google.com/forum/#!topic/cloudify-users/ANvSv7mV7h4
I set up Celery with a Django app and Redis as the broker.
@task
def proc(product_id, url, did, did_name):
    ## some long operation here

@task
def Scraping(product_id, num=None):
    if num:
        num = int(num)  ## this lets me set how many subtasks to run now
    res = group([proc.s(product_id, url, did, dis[did]) for did in dis.keys()[:num]])()
    result = res.get()
    return sum(result)
The first few subtasks run successfully, but later one of the workers disappears and new tasks stay in RECEIVED status, because the worker that should handle them no longer exists.
I set up minimal concurrency and 2 workers in /etc/default/celeryd.
I monitor CPU and memory usage; no high load is detected.
There are no errors in the Celery logs!
What's wrong?
[2015-12-19 04:00:30,131: INFO/MainProcess] Task remains.tasks.proc[fd0ec29c-436f-4f60-a1b6-3785342ac173] succeeded in 20.045763085s: 6
[2015-12-19 04:17:28,895: INFO/MainProcess] missed heartbeat from w2@server.domain.com
[2015-12-19 04:17:28,897: DEBUG/MainProcess] w2@server.domain.com joined the party
[2015-12-19 05:11:44,057: INFO/MainProcess] missed heartbeat from w2@server.domain.com
[2015-12-19 05:11:44,058: DEBUG/MainProcess] w2@server.domain.com joined the party
SOLUTION:
If you use django-celery and want to run Celery as a daemon, do not use app() (http://docs.celeryproject.org/en/latest/userguide/application.html). Instead, point the daemon at your project's manage.py in /etc/default/celeryd, like this: CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"
Do not disable heartbeats!
To run Celery through manage.py like this, you need to (see the sketch after this list):
set CELERY_APP="" in /etc/default/celeryd, because if you don't, the init script will build the run command with the old "app" argument;
add the line export DJANGO_SETTINGS_MODULE="your_app.settings" to the celeryd config if you are not using the default settings.
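As a rough sketch only (the paths and project name are placeholders of mine, and the exact variable set depends on your init script), the relevant part of /etc/default/celeryd would look something like:
# /etc/default/celeryd -- sketch; adjust paths and names to your project
CELERYD_CHDIR="/opt/your_project"                        # directory containing manage.py
CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"   # run through manage.py, not app()
CELERY_APP=""                                            # keep empty so the old "app" argument is not added
export DJANGO_SETTINGS_MODULE="your_project.settings"    # only if you don't use the default settings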
I am hoping someone can help me as I've looked on Stack Overflow and cannot find a solution to my problem. I am running a Django project and have Supervisor, RabbitMQ and Celery installed. RabbitMQ is up and running and Supervisor is ensuring my celerybeat is running, however, while it logs that the beat has started and sends tasks every 5 minutes (see below), the tasks never actually execute:
My supervisor program conf:
[program:nrv_twitter]
; Set full path to celery program if using virtualenv
command=/Users/tsantor/.virtualenvs/nrv_env/bin/celery beat -A app --loglevel=INFO --pidfile=/tmp/nrv-celerybeat.pid --schedule=/tmp/nrv-celerybeat-schedule
; Project dir
directory=/Users/tsantor/Projects/NRV/nrv
; Logs
stdout_logfile=/Users/tsantor/Projects/NRV/nrv/logs/celerybeat_twitter.log
redirect_stderr=true
autorestart=true
autostart=true
startsecs=10
user=tsantor
; if rabbitmq is supervised, set its priority higher so it starts first
priority=999
Here is the output of the log from the program above:
[2014-12-16 20:29:42,293: INFO/MainProcess] beat: Starting...
[2014-12-16 20:34:08,161: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
[2014-12-16 20:39:08,186: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
[2014-12-16 20:44:08,204: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
[2014-12-16 20:49:08,205: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
[2014-12-16 20:54:08,223: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
Here is my celery.py settings file:
from datetime import timedelta
BROKER_URL = 'amqp://guest:guest@localhost//'
CELERY_DISABLE_RATE_LIMITS = True
CELERYBEAT_SCHEDULE = {
    'gettweets-every-5-mins': {
        'task': 'twitter.tasks.get_tweets',
        'schedule': timedelta(seconds=300)  # 300 seconds = every 5 minutes
    },
}
Here is my celeryapp.py:
from __future__ import absolute_import
import os
from django.conf import settings
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'app.settings')
app = Celery('app')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
Here is my twitter/tasks.py:
from __future__ import absolute_import
import logging
from celery import shared_task
from twitter.views import IngestTweets
log = logging.getLogger('custom.log')
@shared_task
def get_tweets():
    """
    Get tweets and save them to the DB
    """
    instance = IngestTweets()
    IngestTweets.get_new_tweets(instance)
    log.info('Successfully ingested tweets via celery task')
    return True
The get_tweets method never gets executed, yet I know it works: I can run get_tweets manually and it works fine.
I have spent two days trying to figure out why it is sending due tasks but not executing them. Any help is greatly appreciated. Thanks in advance.
user2097159, thanks for pointing me in the right direction; I was not aware that I also had to run a worker under Supervisor. I thought it was either a worker or a beat, but now I understand that I need a worker to handle the task and a beat to fire off the task periodically.
Below is the missing worker config for supervisor:
[program:nrv_celery_worker]
; Worker
command=/Users/tsantor/.virtualenvs/nrv_env/bin/celery worker -A app --loglevel=INFO
; Project dir
directory=/Users/tsantor/Projects/NRV/nrv
; Logs
stdout_logfile=/Users/tsantor/Projects/NRV/nrv/logs/celery_worker.log
redirect_stderr=true
autostart=true
autorestart=true
startsecs=10
user=tsantor
numprocs=1
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
I then reset the RabbitMQ queue. Now that I have both the beat and worker programs managed via supervisor, all is working as intended. Hope this helps someone else out.
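For anyone following along, the usual way to make Supervisor pick up the new program section is roughly this (standard supervisorctl commands; the program name is taken from the config above):
supervisorctl reread                    # read the new [program:nrv_celery_worker] section
supervisorctl update                    # start/restart programs whose config changed
supervisorctl status nrv_celery_worker  # confirm the worker is RUNNING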
You need to start both a worker process and a beat process. You can create separate processes as described in tsantor's answer, or you can create a single process with both a worker and a beat. This can be more convenient during development (but is not recommended for production).
From "Starting the scheduler" in the Celery documentation:
You can also embed beat inside the worker by enabling the workers -B option, this is convenient if you’ll never run more than one worker node, but it’s not commonly used and for that reason isn’t recommended for production use:
$ celery -A proj worker -B
For example Supervisor config files, see https://github.com/celery/celery/tree/master/extra/supervisord/ (linked from "Daemonization").
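Applied to the project in this question, where the app is invoked as -A app in the existing Supervisor commands, the embedded-beat form would presumably look something like:
celery -A app worker -B --loglevel=INFO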
UPDATE 3: I found the issue. See the answer below.
UPDATE 2: It seems I might have been dealing with an automatic-naming and relative-imports problem by running the djcelery tutorial through the manage.py shell; see below. It is still not working for me, but now I get new error messages in the log. See below.
UPDATE: I added the log at the bottom of the post. It seems the example task is not registered?
Original Post:
I am trying to get django-celery up and running. I was not able to get through the example.
I installed rabbitmq successfully and went through the tutorials without trouble: http://www.rabbitmq.com/getstarted.html
I then tried to go through the djcelery tutorial.
When I run python manage.py celeryd -l info I get the message:
[Tasks]
- app.module.add
[2011-07-27 21:17:19,990: WARNING/MainProcess] celery@sequoia has started.
So that looks good. I put this at the top of my settings file:
import djcelery
djcelery.setup_loader()
BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
BROKER_VHOST = "/"
I added this to my installed apps:
'djcelery',
here is my tasks.py file in the tasks folder of my app:
from celery.task import task
@task()
def add(x, y):
    return x + y
I added this to my django.wsgi file:
os.environ["CELERY_LOADER"] = "django"
Then I entered this at the command line:
>>> from app.module.tasks import add
>>> result = add.delay(4,4)
>>> result
(AsyncResult: 7auathu945gry48- a bunch of stuff)
>>> result.ready()
False
So it looks like it worked, but here is the problem:
>>> result.result
>>> (nothing is returned)
>>> result.get()
When I put in result.get() it just hangs. What am I doing wrong?
UPDATE: This is what running the logger in the foreground says when I start up the worker server:
No handlers could be found for logger "multiprocessing"
[Configuration]
- broker: amqplib://guest@localhost:5672/
- loader: djcelery.loaders.DjangoLoader
- logfile: [stderr]@INFO
- concurrency: 4
- events: OFF
- beat: OFF
[Queues]
- celery: exchange: celery (direct) binding: celery
[Tasks]
- app.module.add
[2011-07-27 21:17:19,990: WARNING/MainProcess] celery@sequoia has started.
C:\Python27\lib\site-packages\django-celery-2.2.4-py2.7.egg\djcelery\loaders.py:80: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
  warnings.warn("Using settings.DEBUG leads to a memory leak, never ")
then when I put in the command:
>>> result = add(4,4)
This appears in the error log:
[2011-07-28 11:00:39,352: ERROR/MainProcess] Unknown task ignored: Task of kind 'tasks.add' is not registered, please make sure it's imported. Body->"{'retries': 0, 'task': 'tasks.add', 'args': (4,4), 'expires': None, 'eta': None,
'kwargs': {}, 'id': '225ec0ad-195e-438b-8905-ce28e7b6ad9'}"
Traceback (most recent call last):
  File "C:\Python27\..\celery\worker\consumer.py", line 368, in receive_message
    eventer=self.event_dispatcher)
  File "C:\Python27\..\celery\worker\job.py", line 306, in from_message
    **kw)
  File "C:\Python27\..\celery\worker\job.py", line 275, in __init__
    self.task = tasks[self.task_name]
  File "C:\Python27\...\celery\registry.py", line 59, in __getitem__
    raise self.NotRegistered(key)
NotRegistered: 'tasks.add'
How do I get this task registered and handled properly? Thanks.
UPDATE 2:
This link suggested that the not registered error can be due to task name mismatches between client and worker - http://celeryproject.org/docs/userguide/tasks.html#automatic-naming-and-relative-imports
I exited the manage.py shell, opened a plain Python shell, and entered the following:
>>> from app.module.tasks import add
>>> result = add.delay(4,4)
>>> result.ready()
False
>>> result.result
>>> (nothing returned)
>>> result.get()
(it just hangs there)
So I am getting the same behavior, but a new log message. From the log, it appears the worker is doing the work but won't feed the result back out:
[2011-07-28 11:39:21,706: INFO/MainProcess] Got task from broker: app.module.tasks.add[7e794740-63c4-42fb-acd5-b9c6fcd545c3]
[2011-07-28 11:39:21,706: INFO/MainProcess] Task app.module.tasks.add[7e794740-63c4-42fb-acd5-b9c6fcd545c3] succeeded in 0.04600000038147s: 8
So the server got the task and computed the correct answer, but it won't send it back. Why not?
I found the solution to my problem from another stackoverflow post: Why does Celery work in Python shell, but not in my Django views? (import problem)
I had to add these lines to my settings file:
CELERY_RESULT_BACKEND = "amqp"
CELERY_IMPORTS = ("app.module.tasks", )
then in the tasks.py file I named the task explicitly:
@task(name="module.tasks.add")
Both the server and the client had to be informed of the task name. The celery and django-celery tutorials omit these lines.
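Pulling those pieces together, here is a rough sketch of what the fix looks like (module paths follow the placeholder names used in this question, so adapt them to your project):
# settings.py -- tell both client and worker where results go and which task modules to import
CELERY_RESULT_BACKEND = "amqp"
CELERY_IMPORTS = ("app.module.tasks", )

# app/module/tasks.py -- give the task an explicit name so client and worker agree on it
from celery.task import task

@task(name="module.tasks.add")
def add(x, y):
    return x + y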
It is easier to understand the problem if you run celery in debug mode:
python manage.py celeryd
What do the celery logs say? Is celery receiving the task?
If not, there is probably a problem with the broker (wrong queue?).
Give us more detail; that way we can help you.
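To actually get debug-level output from that command, I believe you also need to pass the log level explicitly, along the lines of the flags used elsewhere in this thread:
python manage.py celeryd -l debug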