I'm facing a basic issue while setting up python-rq - the rqworker doesn't seem to recognize jobs that are pushed to the queue it's listening on.
Everything runs inside a virtualenv.
I have the following code:
from redis import Redis
from rq import Queue
from rq.registry import FinishedJobRegistry
from videogen import videogen
import time
redis_conn = Redis(port=5001)
videoq = Queue('medium', connection=redis_conn)
fin_registry = FinishedJobRegistry(connection=redis_conn, name='medium')
jobid = 1024
job = videoq.enqueue(videogen, jobid)
while not job.is_finished:
    time.sleep(2)
print job.result
Here videogen is a simple function which immediately returns the integer parameter it receives.
On running rqworker medium and starting the app, there is no result printed. There are NO extra traces at rqworker other than this:
14:41:29 RQ worker started, version 0.5.0
14:41:29
14:41:29 *** Listening on medium...
The Redis instance is accessible from the same shell where I run rqworker, and redis-cli even shows the updated keys:
127.0.0.1:5001> keys *
1) "rq:queues"
2) "rq:queue:medium"
3) "rq:job:9a46f9c5-03e1-4b08-946b-61ad2c3815b1"
So what is possibly missing here?
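Silly error: I had to supply the Redis connection URL to rqworker. Without it, the worker connects to the default Redis port (6379) instead of 5001, so it never sees the jobs.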
rqworker --url redis://localhost:5001 medium
It's worth noting that this can also happen if you run your RQ workers on Windows, which is not supported by the workers. From the documentation:
RQ workers will only run on systems that implement fork(). Most
notably, this means it is not possible to run the workers on Windows
without using the Windows Subsystem for Linux and running in a bash
shell.
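A quick way to check for this before starting a worker is to ask Python whether the platform exposes fork() at all. This is only a minimal sketch; the helper name can_run_rq_worker is made up for illustration and is not part of RQ.

```python
import os
import sys


def can_run_rq_worker():
    """Return True if this platform exposes fork(), which RQ workers rely on.

    Hypothetical helper for illustration; it is not part of RQ itself.
    """
    return hasattr(os, "fork")


if not can_run_rq_worker():
    # On native Windows, os.fork does not exist.
    print("fork() is unavailable on %s; run the worker under WSL or Linux."
          % sys.platform)
```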
I'm running a Kubernetes cluster with three Celery pods, using a single Redis pod as the message queue. Celery version 4.1.0, Python 3.6.3, standard Redis pod from helm.
During a seemingly quick influx of tasks, the Celery pods stop processing tasks altogether. They are fine for the first few tasks, but eventually stop working and my tasks hang.
My tasks follow this format:
@app.task(bind=True)
def my_task(self, some_param):
    result = get_data(some_param)
    if result != expectation:
        # bind=True passes the task instance as the first argument
        self.retry(throw=False, countdown=5)
And are generally queued as follows:
from my_code import my_task
my_task.apply_async(queue='worker', kwargs=celery_params)
The relevant portion of my deployment.yaml:
command: ["celery", "worker", "-A", "myapp.implementation.celery_app", "-Q", "http"]
The only difference between this cluster and my local setup, which I manage with docker-compose, is that the cluster runs a prefork pool while locally I run an eventlet pool so that I can put together a code coverage report. I've tried running eventlet on the cluster but see no difference in the results: the tasks still hang.
Is there something I'm missing about running a Celery worker in Kubernetes? Is there a bug that could be affecting my results? Are there any good ways to break into the cluster to see what's actually happening with this issue?
Running the celery tasks without apply_async allowed me to debug this issue, showing that there was a concurrency logic error in the Celery tasks. I highly recommend this method of debugging Celery tasks.
Instead of:
from my_code import my_task
celery_params = {'key': 'value'}
my_task.apply_async(queue='worker', kwargs=celery_params)
I used:
from my_code import my_task
celery_params = {'key': 'value'}
my_task(**celery_params)
This allowed me to locate the concurrency issue. After I had found the bug, I converted the code back to an asynchronous method call using apply_async.
I am starting a process using Python's multiprocessing module. The process is invoked by a POST request sent in a Django project. When I use the development server (python manage.py runserver), the POST request starts the process and finishes immediately.
I deployed the project to production using nginx and uWSGI.
Now when I send the same POST request, it takes around 5-7 minutes to complete. This only happens with POST requests that start a process. Other POST requests work fine.
What could be the reason for this delay, and how can I solve it?
Basically the background processing needs to be started outside the WSGI application module.
In WSGI, Python webapp processes are started to handle requests; how many depends on the configuration. If one of these processes spawns a new process, that blocks the WSGI process from handling new requests, so the server waits for the spawned process to finish before it handles anything else.
What I would suggest is to use a shared queue in the WSGI application module to feed a process started outside the WSGI application module, something like the code below. This starts one new consumer process for each WSGI process, outside the webapp module, so that requests are not blocked.
your_app/webapp.py:
from . import bg_queue
def post():
    # Webapp POST code here
    bg_queue.put(<task data>)  # multiprocessing queues use put(), not add()
your_app/processor.py:
from multiprocessing import Process
class Consumer(Process):
    def __init__(self, input_q):
        super(Consumer, self).__init__()  # must run before start()
        self.input_q = input_q
    def run(self):
        while True:
            task_data = self.input_q.get()  # blocks until a task arrives
            <process data>
your_app/__init__.py:
from multiprocessing import Queue
from .processor import Consumer
bg_queue = Queue()
consumer = Consumer(bg_queue)
consumer.daemon = True  # do not keep the parent alive on shutdown
consumer.start()
I figured out a workaround (I don't know if it qualifies as an answer).
I recorded the background process as a job in the database and used a cron job to check for pending jobs. If there are any, the cron job starts a background process for each job and exits.
The cron job runs every minute, so there is not much delay. This improved performance, because heavy tasks like this now run separately from the main application.
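For anyone wanting to try the same approach, here is a rough sketch of the job-table idea using sqlite3 from the standard library. The table layout and the function names (create_schema, enqueue, run_pending) are my own assumptions, not the poster's code, and for simplicity the handler runs inline instead of in a separate background process.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical jobs table: one row per requested background job.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )


def enqueue(conn, payload):
    """Called from the web request: record the job and return at once."""
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    conn.commit()
    return cur.lastrowid


def run_pending(conn, handler):
    """Called by cron: handle every pending job once, then return."""
    rows = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    handled = 0
    for job_id, payload in rows:
        # Claim the job first so an overlapping cron run skips it.
        claimed = conn.execute(
            "UPDATE jobs SET status = 'running' "
            "WHERE id = ? AND status = 'pending'",
            (job_id,),
        ).rowcount
        conn.commit()
        if not claimed:
            continue
        handler(payload)
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
        conn.commit()
        handled += 1
    return handled


conn = sqlite3.connect(":memory:")
create_schema(conn)
enqueue(conn, "render video 1024")

processed = []
first_run = run_pending(conn, processed.append)   # handles the queued job
second_run = run_pending(conn, processed.append)  # nothing left to do
```

The web request only inserts a row, so it returns immediately; all heavy work happens in whatever cron invokes.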
I have invoked a task that fetches some information remotely with urllib2 a few thousand times. The tasks are scheduled with a random eta (within a week) so they all don't hit the server at the same time. Sometimes I get a 404, sometimes not. I am handling the error in case it happens.
In the RabbitMQ console I can see 16 unacknowledged messages:
I stopped celery, purged the queue and restarted it. The 16 unacknowledged messages were still there.
I have other tasks that go to the same queue and none of them was executed either. After purging, I tried to submit another task and its state remains ready:
Any ideas how I can find out why messages remain unacknowledged?
Versions:
celery==3.1.4
{rabbit,"RabbitMQ","3.5.3"}
celeryapp.py
CELERYBEAT_SCHEDULE = {
    'social_grabber': {
        'task': '<django app>.tasks.task_social_grabber',
        'schedule': crontab(hour=5, minute=0, day_of_week='sunday'),
    },
}
tasks.py
@app.task
def task_social_grabber():
    for user in users:
        eta = randint(0, 60 * 60 * 24 * 7)  # one week in seconds
        # positional args must be a tuple: (user,) rather than (user)
        task_social_grabber_single.apply_async((user,), countdown=eta)
There is no routing for this task defined so it goes into the default queue: celery. There is one worker processing this queue.
supervisord.conf:
[program:celery]
autostart = true
autorestart = true
command = celery worker -A <django app>.celeryapp:app --concurrency=3 -l INFO -n celery
RabbitMQ broke QoS settings in version 3.3. You need to upgrade celery to at least 3.1.11 (changelog) and kombu to at least 3.0.15 (changelog). You should use the latest versions.
I hit this exact same behavior when 3.3 was released. RabbitMQ flipped the default behavior of the prefetch_count flag. Before this, if a consumer reached the CELERYD_PREFETCH_MULTIPLIER limit in eta'd messages, the worker would up this limit in order to fetch more messages. The change broke this behavior, as the new default behavior denied this capability.
I had similar symptoms. Messages were getting to the MQ (visible in the charts) but were not picked up by the worker.
This led me to assume that my Django app had set up the Celery app correctly, but I was missing an import that ensures Celery is configured during Django startup:
from __future__ import absolute_import
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app # noqa
It is a silly mistake, but the messages reaching the broker and the call returning an AsyncResult threw me off track and had me looking in the wrong places. Then I noticed that setting CELERY_ALWAYS_EAGER = True had no effect: even then, tasks weren't executed at all.
PS: This may not be an answer to @kev's question, but since I landed here a couple of times while looking for a solution to my problem, I am posting it for anyone in a similar situation.
I have a production setup running Celery workers that make POST/GET requests to a remote service and store the results. It handles a load of around 20k tasks per 15 minutes.
The problem is that the workers go numb for no reason: no errors, no warnings.
I have also tried adding multiprocessing, with the same result.
In the log I see the task execution time increasing, like succeeded in s
For more details look at https://github.com/celery/celery/issues/2621
If your Celery worker sometimes gets stuck, you can use strace and lsof to find out which system call it is stuck in.
For example:
$ strace -p 10268 -s 10000
Process 10268 attached - interrupt to quit
recvfrom(5,
10268 is the PID of the Celery worker; recvfrom(5 means the worker is blocked receiving data from file descriptor 5.
Then you can use lsof to check what file descriptor 5 is in this worker process.
lsof -p 10268
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
......
celery 10268 root 5u IPv4 828871825 0t0 TCP 172.16.201.40:36162->10.13.244.205:wap-wsp (ESTABLISHED)
......
This indicates that the worker is stuck on a TCP connection (see 5u in the FD column).
Some Python packages, such as requests, block while waiting for data from the peer, which can cause the Celery worker to hang. If you are using requests, make sure to set the timeout argument.
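To see why the timeout matters, here is a small sketch using only standard-library sockets rather than requests. The silent local server and the helper name read_with_timeout are made up for the demonstration. Without the timeout, recv() would block indefinitely, just like the recvfrom(5 in the strace output. With requests, the equivalent precaution is passing timeout= to the call.

```python
import socket


def read_with_timeout(host, port, timeout_s):
    """Connect and try to read; return None if the peer stays silent."""
    with socket.create_connection((host, port), timeout=timeout_s) as conn:
        try:
            return conn.recv(1024)
        except socket.timeout:
            # The timeout turns an indefinite hang into an error we can handle.
            return None


# A local server that accepts connections but never sends anything,
# standing in for an unresponsive remote service (hypothetical setup).
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
silent_server.listen(1)
host, port = silent_server.getsockname()

result = read_with_timeout(host, port, timeout_s=0.5)
silent_server.close()
```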
Have you seen this page:
https://www.caktusgroup.com/blog/2013/10/30/using-strace-debug-stuck-celery-tasks/
I also faced this issue when using delay() on a shared_task with celery, kombu, amqp and billiard. After calling the API, everything works until the code reaches delay() for the @shared_task, where it hangs.
The issue was that the settings below were missing from the main application's __init__.py. They make sure the app is always imported when Django starts, so that shared_task will use this app.
In __init__.py:
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celeryApp
#__all__ = ('celeryApp',)
__all__ = ['celeryApp']
Note 1: In place of celeryApp, use your own application's name, i.e. import the app defined in celery.py and expose it here.
Note 2: If you are only facing the hang in shared tasks, the solution above may solve it and you can ignore the rest.
I also want to mention another issue: if anyone is facing an Error 111 connection problem, check whether your versions of amqp, billiard, celery and kombu are compatible with each other (for example amqp==2.2.2, billiard==3.5.0.3, celery==4.1.0, kombu==4.1.0; these versions are just an example). Also check that Redis is installed on your system, if you are using Redis.
Also make sure you are using Kombu 4.1.0, because the latest version of Kombu renames async to asynchronous.
Follow this tutorial
Celery Django Link
Add the following to the settings.
NB: Install Redis, which is used here for both the transport and the result backend.
# TRANSPORT
CELERY_BROKER_TRANSPORT = 'redis'
CELERY_BROKER_HOST = 'localhost'
CELERY_BROKER_PORT = '6379'
CELERY_BROKER_VHOST = '0'
# RESULT
CELERY_RESULT_BACKEND = 'redis'
CELERY_REDIS_HOST = 'localhost'
CELERY_REDIS_PORT = '6379'
CELERY_REDIS_DB = '1'
Problem
Celery workers are hanging on task execution when using a package which accesses a ZEO server. However, if I were to access the server directly within tasks.py, there's no problem at all.
Background
I have a program that reads and writes to a ZODB file. Because I want multiple users to be able to access and modify this database concurrently, I have it managed by a ZEO server, which should make it safe across multiple processes and threads. I define the database within a module of my program:
from ZEO import ClientStorage
from ZODB.DB import DB
addr = 'localhost', 8090
storage = ClientStorage.ClientStorage(addr, wait=False)
db = DB(storage)
SSCCE
I'm obviously attempting more complex operations, but let's assume I only want the keys of a root object, or its children. I can produce the problem in this context.
I create dummy_package with the above code in a module, databases.py, and a bare-bones module meant to perform database access:
# main.py
def get_keys(dict_like):
    return dict_like.keys()
If I don't try any database access with dummy_package, I can import the database and access root without issue:
# tasks.py
from dummy_package import databases
@task()
def simple_task():
    connection = databases.db.open()
    keys = connection.root().keys()
    connection.close(); databases.db.close()
    return keys  # Works perfectly
However, trying to pass a connection or a child of root makes the task hang indefinitely.
@task()
def simple_task():
    connection = databases.db.open()
    root = connection.root()
    ret = main.get_keys(root)  # Hangs indefinitely
    ...
If it makes any difference, these Celery tasks are accessed by Django.
Question
So, first of all, what's going on here? Is there some sort of race condition caused by accessing the ZEO server in this way?
I could make all database access Celery's responsibility, but that will make for ugly code. Furthermore, it would ruin my program's ability to function as a standalone program. Is it not possible to interact with ZEO within a routine called by a Celery worker?
Do not save an open connection or its root object as a global.
You need one connection per thread. ZEO makes it possible for multiple threads to access the database, but it sounds like you are sharing something that is not thread-local (e.g. a module-level global in databases.py).
Save the db as a global, but call db.open() during each task. See http://zodb.readthedocs.org/en/latest/api.html#connection-pool
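One way to keep that discipline is a small context manager that opens a connection at the start of each task and always closes it afterwards. This is only a sketch: task_connection is a hypothetical helper, and FakeDB and FakeConnection are stand-ins so the example runs without a ZEO server. With ZODB you would pass the real module-level db object instead.

```python
from contextlib import contextmanager


@contextmanager
def task_connection(db):
    """Open a connection for one task and always close it afterwards."""
    connection = db.open()
    try:
        yield connection
    finally:
        connection.close()


# Stand-ins so the sketch runs without a ZEO server (hypothetical classes).
class FakeConnection(object):
    def __init__(self, db):
        self.db = db

    def root(self):
        return {"a": 1, "b": 2}

    def close(self):
        self.db.open_connections -= 1


class FakeDB(object):
    def __init__(self):
        self.open_connections = 0

    def open(self):
        self.open_connections += 1
        return FakeConnection(self)


db = FakeDB()  # module-level global, playing the role of db in databases.py


def simple_task():
    # The connection lives only for the duration of this task.
    with task_connection(db) as connection:
        return sorted(connection.root().keys())


keys = simple_task()
```

Note that the task closes only its connection and leaves the global db open for the next task.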
I don't completely understand what's going on, but I'm thinking the deadlock has something to do with the fact that Celery uses multiprocessing by default for concurrency. Switching over to using Eventlet for tasks that need to access the ZEO server solved my problem.
My process
Start up a worker that uses Eventlet, and one that uses standard multiprocessing.
celery is the name of the default queue (for historical reasons), so have the Eventlet worker handle this queue:
$ celery worker --concurrency=500 --pool=eventlet --loglevel=debug \
    -Q celery --hostname eventlet_worker
$ celery worker --loglevel=debug \
    -Q multiprocessing_queue --hostname multiprocessing_worker
Route tasks which need standard multiprocessing to the appropriate queue. All others will be routed to the celery queue (Eventlet-managed) by default. (If using Django, this goes in settings.py):
CELERY_ROUTES = {'project.tasks.ex_task': {'queue': 'multiprocessing_queue'}}