I'm new to RQ and am trying to use it for a job which will run in the background. I have managed to set it up, and I'm also able to start more than one worker.
Now I'm trying to run these workers concurrently. I installed supervisor and followed a tutorial to add programs to it, and it worked.
Here is my supervisor configuration:
[program:rqworker]
command=/usr/local/bin/rq worker mysql
process_name=rqworker1-%(process_num)s
numprocs=3
directory=/home/hp/Python/sample
stopsignal=TERM
autostart=true
autorestart=true
stdout_logfile=/home/hp/Python/sample/logs
The worker function is present in the sample directory mentioned above.
The problem is that even after specifying numprocs as 3 in the config file, the workers do not run in parallel.
Multiple workers do start, but the jobs are still processed one at a time.
I also tried the approach from a Stack Overflow answer on this, but the jobs are still not divided among the workers.
Could anyone tell me what is wrong with this configuration/what I need to change?
I found the problem; it wasn't with supervisor or rqworker. The manager program was blocking concurrency, by waiting for task completion!
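A minimal sketch of that failure mode. It uses Python's standard `concurrent.futures` thread pool as a stand-in for the RQ queue and its three workers, which is an assumption made so the example runs without Redis. The names `PeakCounter`, `job`, `run_blocking`, and `run_enqueue_first` are hypothetical. Waiting on each result right after submitting it keeps only one worker busy, while submitting everything first lets the pool spread the jobs.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class PeakCounter:
    """Tracks how many jobs are running at the same moment."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.active -= 1


def job(counter, seconds=0.2):
    # Stand-in for the real background job.
    with counter:
        time.sleep(seconds)
    return seconds


def run_blocking(n_jobs=3, n_workers=3):
    """Manager waits for each job before submitting the next one."""
    counter = PeakCounter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_jobs):
            pool.submit(job, counter).result()  # blocks here every time
    return counter.peak


def run_enqueue_first(n_jobs=3, n_workers=3):
    """Manager submits every job first and only then collects results."""
    counter = PeakCounter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(job, counter) for _ in range(n_jobs)]
        for future in futures:
            future.result()
    return counter.peak
```

With RQ the same idea would be to call `enqueue` for every job first, keep the returned job objects, and only check them afterwards.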
I have Celery running on a few computers and use Flower for monitoring.
The computers are used by different people.
Celery beat generates jobs for all the workers from one of the computers.
Every time a newly coded task is ready, all the workers except the one on the beat computer raise a "task not registered" exception.
What is the recommended way to sync the code to all the other computers on the network? Is there a pre-hook kind of mechanism in Celery to check for new code?
Unfortunately, you need to update the code on all the workers (nodes) and after that you need to restart all of them. This is by (good) design.
A clever systemd service could, in theory:
1. send the graceful shutdown signal
2. run pip install -U your-project
3. start the Celery service
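The three steps can be sketched as a small Python helper that a service or deploy script might call on each node. This is only an illustration under stated assumptions: the unit name `celery` and the helper names `update_commands` and `update_worker` are hypothetical, and it assumes the unit stops the worker with SIGTERM, which Celery treats as a graceful shutdown.

```python
import subprocess


def update_commands(service="celery", package="your-project"):
    """Return the update steps as argv lists, in the order they must run."""
    return [
        ["systemctl", "stop", service],      # graceful shutdown (assumes SIGTERM)
        ["pip", "install", "-U", package],   # pull the new task code
        ["systemctl", "start", service],     # worker re-imports and re-registers tasks
    ]


def update_worker(run=subprocess.run, **kwargs):
    """Run each step; check=True aborts on the first failing command."""
    for argv in update_commands(**kwargs):
        run(argv, check=True)
```

Passing `run` as a parameter is a design choice that lets the sequence be checked without touching the system.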
This post is a continuation of my previous post, "celery how to implement single queue with multiple workers executing in parallel?"
I set up Celery to work with eventlet using this command:
celery -A project worker -P eventlet -l info --concurrency=4
I can see in Flower that my tasks are moving to the active list faster, but I am not sure whether they are executing in parallel. I have a 4-core production server, but I am not utilizing all the cores at the same time.
My question is: how can I use all 4 cores to execute tasks in parallel?
Both the eventlet and gevent worker pools provide great concurrency, but at the cost of limiting parallelism to 1: all green threads share a single process and therefore a single core. To get truly parallel task execution and use all cores, run several Celery instances on the same machine.
I know this runs counter to what popular Linux distros have in mind, so just ignore the system packages and roll your own configuration from scratch. A systemd service template is your friend.
Another option is to run Celery with the prefork pool: you get parallelism, at the cost of capping concurrency at the number of worker processes.
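As a rough sketch of the "several instances" option, the helper below builds one worker command line per core, each with its own node name. The function name `worker_commands` is hypothetical. The app name `project` and the flags are taken from the command in the question, and the `workerN@%h` node-name form is an assumption based on current Celery conventions.

```python
def worker_commands(app="project", instances=4, pool="eventlet", concurrency=4):
    """Build one `celery worker` argv per instance, each with a unique node name."""
    return [
        [
            "celery", "-A", app, "worker",
            "-P", pool,
            "-l", "info",
            f"--concurrency={concurrency}",
            "-n", f"worker{i}@%h",  # unique name so the instances do not clash
        ]
        for i in range(1, instances + 1)
    ]


if __name__ == "__main__":
    # Print the commands, e.g. to paste into a service template.
    for argv in worker_commands():
        print(" ".join(argv))
```

For the prefork alternative a single instance is enough: the same command with `pool="prefork"` and `instances=1` starts four worker processes.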
I start the worker by executing the following in the terminal:
celery -A cel_test worker --loglevel=INFO --concurrency=10 -n worker1.%h
Then I get a long looping error message stating that celery has received an unregistered task and has triggered:
KeyError: 'cel_test.grp_all_w_codes.mk_dct' #this is the name of the task
The problem is that cel_test.grp_all_w_codes.mk_dct doesn't exist. In fact, there isn't even a module cel_test.grp_all_w_codes, let alone the task mk_dct. There was one a few days ago, but I've since deleted it. I thought maybe there was a .pyc file floating around, but there isn't. I also can't find a single reference in my code to the task that's throwing the error. I shut down my computer and restarted the RabbitMQ server, thinking a reference to something might be stuck in memory, but it did not help.
Does anyone have any idea what could be the problem here or what I'm missing?
Well, without knowing your conf files, I can see two reasons that would provoke this:
The mk_dct task had not completed when you stopped the worker and deleted the module. If you're running with CELERY_ACKS_LATE, the worker will try to relaunch the task every time you restart it. Try removing this setting, or launch the worker with the purge option:
celery -A cel_test worker --loglevel=INFO --concurrency=10 -n worker1.%h --purge
The mk_dct task is launched by your celery beat. If so, try restarting celery beat and clearing its database backend if you have a custom one.
If that does not solve the problem, please post your Celery configuration, and make sure you have removed all the .pyc files from your project and restarted everything.
I'm using celery 3.0.11 and djcelery 3.0.11 with python 2.7 and django 1.3.4.
I'm trying to run celeryd as a daemon and I've followed instructions from http://docs.celeryproject.org/en/latest/tutorials/daemonizing.html
When I run the workers using celeryd as described in the link with a python (non-django) configuration, the daemon comes up.
When I run the workers using python manage.py celery worker --loglevel=info to test the workers, they come up fine and start to consume messages.
But when I run celeryd with a django configuration i.e. using manage.py celeryd_multi, I just get a message that says
> Starting nodes...
> <node_name>.<user_name>: OK
But I don't see any daemon running and my messages obviously don't get consumed. There is an empty log file (the one that's configured in the celeryd config file).
I've tried this with a very basic django project as well and I get the same result.
I'm wondering if I'm missing a basic piece of configuration. Since I don't get any errors and I don't have any logs, I'm stuck. Running it with sh -x doesn't show anything special either.
Has anyone experienced this before or does anyone have any suggestions on what I can try?
Thanks,
For now I've switched to using supervisord instead of celeryd and I have no issues running multiple workers.
I have written an Upstart job to run Celery on my Ubuntu server. Here's my configuration file, called celeryd.conf:
# celeryd - runs the celery daemon
#
# This task is run on startup to run the celery daemon
description "run celery daemon"
start on startup
expect fork
respawn
exec su - trakklr -c "/app/trakklr/src/trakklr celeryd --events --beat --loglevel=debug --settings=production"
When I execute sudo service celeryd start, the celeryd process starts just fine, and so do all of the worker processes.
But when I execute sudo service celeryd stop, it stops most of the processes and leaves a few hanging.
Why is this happening? I'm using Celery 2.5.3.
Here's an issue from the GitHub tracker.
https://github.com/celery/django-celery/issues/142
I still use init.d to run celery so this may not apply. With that in mind, stopping the celery service sends the TERM signal to celery. This tells the workers not to accept new tasks but it does not terminate existing tasks. Therefore, depending on how long your tasks take to execute you may see tasks for some time after telling celery to stop. Eventually, they will all shut down unless you have some other problem.
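The behaviour described above can be illustrated with a toy worker loop. This is not Celery's implementation, only a sketch of the same idea using the standard `signal` module: the TERM handler records the request, the job in progress runs to completion, and no further jobs are started. The class name `GracefulWorker` is hypothetical, and the code assumes it runs in the main thread, where Python allows signal handlers to be installed.

```python
import signal


class GracefulWorker:
    """Toy worker: on SIGTERM, finish the current job, then stop."""

    def __init__(self):
        self.stop_requested = False
        self.completed = []

    def _on_term(self, signum, frame):
        # Only record the request; the running job is not interrupted.
        self.stop_requested = True

    def run(self, jobs):
        # signal.signal must be called from the main thread.
        previous = signal.signal(signal.SIGTERM, self._on_term)
        try:
            for job in jobs:
                if self.stop_requested:
                    break  # accept no new jobs after TERM
                self.completed.append(job())  # current job runs to the end
        finally:
            signal.signal(signal.SIGTERM, previous)  # restore the old handler
        return self.completed
```

A long-running job therefore keeps its process alive after the stop command, which matches the "processes left for some time" observation.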
I wasn't able to figure this out but it seemed to be an issue with my older celery version. I found this issue mentioned on their issue-tracker and I guess it points to the same issue:
https://github.com/celery/django-celery/issues/142
I upgraded my celery and django-celery to the 3.x.x versions and this issue was gone.