celery worker node and celery beat startup configuration - python

I have a celery worker that executes a bunch of tasks for data loading. I start up my worker node using the following:
celery -A ingest_tasks worker -Q ingest --loglevel=INFO --concurrency=3 -f /var/log/celery/ingest_tasks.log
I have another application I want to set up with celery beat to periodically grab files from various locations. Since my second application is not under that worker node, should I be OK just starting the beat like this?
celery -A file_mover beat -s /my/path/celerybeat-schedule
I have never used celery beat before. From reading the docs it seems pretty straightforward, but I wanted to make sure this is correct.
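For context, the kind of schedule I have in mind for file_mover looks roughly like this (the task name, broker URL, and interval are just placeholders):
from datetime import timedelta
from celery import Celery
app = Celery('file_mover', broker='redis://localhost:6379/0')  # placeholder broker
app.conf.beat_schedule = {
    'grab-files': {
        'task': 'file_mover.grab_files',    # placeholder task name
        'schedule': timedelta(minutes=10),  # placeholder interval
    },
}
@app.task
def grab_files():
    ...  # fetch files from the various locations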

Related

Why do tasks from previous celery workers persist when I create a new worker?

A celery worker created with one file of tasks "remembers" tasks from a previous celery worker, even though they are not in the file of tasks that I am now using.
At a time in the past, I created a celery worker using a file containing tasks called 'tasks.py.'
celery -A tasks worker --loglevel=info
All was well. Now, having moved on, I am attempting to create a celery worker from a different file of tasks called 'dwtasks.py.'
celery -A dwtasks worker --loglevel=info
The splash screen that comes up when the new worker starts lists the tasks defined in 'dwtasks.py' and all the tasks that were defined in 'tasks.py.' It will also fail to create the worker if 'tasks.py' is not available. If I make no reference to 'tasks.py,' how is it possible for all future celery workers to know about those tasks?
Celery is v4.3.0 (rhubarb)
OS is Ubuntu 16.04
You have most likely misconfigured Celery. Check your configuration and see whether you use auto-discovery, and what your Celery app imports or includes...
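For example (just a sketch, not your actual configuration), either of these would keep the old tasks module registered on every new worker until it is removed:
from celery import Celery
# An explicit include list keeps 'tasks' registered (and it must stay
# importable) even though the worker is started with -A dwtasks.
app = Celery('dwtasks', broker='amqp://localhost', include=['dwtasks', 'tasks'])
# Auto-discovery behaves the same way: every tasks module found in the
# named packages is registered, whether or not you still use it.
# app.autodiscover_tasks(['some_package'])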

Celery unregistered task KeyError

I start the worker by executing the following in the terminal:
celery -A cel_test worker --loglevel=INFO --concurrency=10 -n worker1.%h
Then I get a long, looping error message stating that celery has received an unregistered task, which triggered:
KeyError: 'cel_test.grp_all_w_codes.mk_dct' #this is the name of the task
The problem with this is that cel_test.grp_all_w_codes.mk_dct doesn't exist. In fact there isn't even a module cel_test.grp_all_w_codes let alone the task mk_dct. There was once a few days ago but I've since deleted it. I thought maybe there was a .pyc file floating around but there isn't. I also can't find a single reference in my code to the task that's throwing the error. I shut down my computer and restarted the rabbitmq server thinking maybe a reference to something was just stuck in memory but it did not help.
Does anyone have any idea what could be the problem here or what I'm missing?
Well, without knowing your conf files, I can see two reasons that would cause this:
the mk_dct task wasn't completed when you stopped the worker and deleted the module. If you're running with CELERY_ACKS_LATE, it will try to relaunch the task every time you restart the worker. Try removing this setting, or launch the worker with the purge option.
celery -A cel_test worker --loglevel=INFO --concurrency=10 -n worker1.%h --purge
the mk_dct task is launched by your celery beat. If so, try relaunching celery beat and clearing its database backend if you use a custom one.
If that does not solve the problem, please post your celery conf, and make sure you have cleaned all the .pyc files from your project and restarted everything.
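For reference, a rough sketch of those clean-up steps, run from the project directory (the schedule file name assumes the default local beat backend):
# Drop all messages still waiting in the queues (irreversible).
celery -A cel_test purge -f
# Remove stale bytecode that may still reference the deleted module.
find . -name "*.pyc" -delete
# If beat is involved, remove its local schedule file so it gets rebuilt.
rm -f celerybeat-schedule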

"celeryd stop" is not working

I am using celery in an uncommon way - I am creating a custom process when celery is started, and this process should be running the whole time celery is running.
Celery workers use this process for their tasks (details not needed).
I run celery from command line and everything is ok:
celery -A celery_jobs.tasks.app worker -B --loglevel=warning
But when I use celeryd to daemonize celery, there is no way to stop it.
The command celeryd stop tries to stop celery but never finishes.
When I check the process trees in both cases, there is a difference - when running from the command line, the parent is obviously the celery process (the main process which has the celery workers as children). Killing (stopping) the parent celery process will stop all the celery workers and my custom process.
But when running with celeryd, my custom process has /sbin/init as its parent - and calling celeryd stop does not work - it seems like the main celery process is waiting for something, or is unable to stop my custom process because it is not a child process of celery.
I don't know much about processes and it is not easy to find information, because I don't know what I should search for, so any tips are appreciated.
I have had the same problem. I needed a quick solution, so I wrote this bash script:
#!/bin/bash
# Ask the init script to stop celery, wait 10 seconds,
# then force-kill any celery processes that are still alive.
/etc/init.d/celeryd stop
sleep 10
export PIDS=`ps -ef | grep celery | grep -v 'grep' | awk '{print $2}'`
for PID in $PIDS; do kill -9 $PID; done;
If the processes haven't stopped after 10 seconds, they are unlikely to stop gracefully, so I decided to stop them abruptly.
I assume your custom process is not a child of any of your pool worker processes and need not be so.
I use supervisord instead of celeryd to daemonize workers. It can be used to daemonize other processes as well, such as your custom processes.
In your case, your supervisord.conf can have multiple sections: one for each celery worker node and one (or more) for your custom process(es).
When you kill the supervisord process (with -TERM) it will take care of terminating all the workers and your custom process as well. If you use -TERM, you will need to make sure your custom processes handle that signal.
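A minimal sketch of such a supervisord.conf, using the worker command from your question and a placeholder for the custom process:
[program:celery_worker]
command=celery -A celery_jobs.tasks.app worker --loglevel=warning
autostart=true
autorestart=true
stopwaitsecs=60                               ; give running tasks time to finish on stop
[program:custom_process]
command=python /path/to/custom_process.py     ; placeholder path
autostart=true
autorestart=true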

Celery beat queue includes obsolete tasks

I'm using periodic celery tasks with Django. I used to have the following task in my app/tasks.py file:
@periodic_task(run_every=timedelta(minutes=2))
def stuff():
...
But now this task has been removed from my app/tasks.py file. However, I keep seeing calls to this task in my celery logs:
[2013-05-21 07:08:37,963: ERROR/MainProcess] Received unregistered task of type u'app.tasks.stuff'.
It seems that the celery beat scheduler that I use does not update its queue. This is how the scheduler is defined in my project/settings.py file:
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
Restarting the celery worker does not help. FYI, I use a Redis broker.
How can I either clear or update the celery beat queue so that older tasks are not sent to my celery worker?
Install django-celery.
As cited in its docs, this project is not needed to use celery, but you need it to enable the admin interface at /admin/djcelery/ for managing periodic tasks. Initially there won't be any registered or periodic tasks.
Restart beat and check the Periodic tasks table again. Beat will have added the existing scheduled tasks to that table with the interval or crontab defined in the settings or the decorators. There you can delete the unwanted tasks.
UPDATE: From Celery 4, it is recommended to use this package instead: https://github.com/celery/django-celery-beat
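If you would rather not click through the admin, a sketch of the same clean-up from the Django shell (this assumes the standard djcelery models; only the task name app.tasks.stuff comes from the question):
# python manage.py shell
from djcelery.models import PeriodicTask
# Delete the schedule entry for the removed task so beat stops sending it.
PeriodicTask.objects.filter(task='app.tasks.stuff').delete()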
Delete the .pyc file for the module where the task was originally written. Or just delete all the .pyc files in your project's directory.
This command should work:
find . -name "*.pyc" -exec rm -rf {} \;
How do I remove all .pyc files from a project?

Upstart job to run Celery doesn't stop all the worker processes

I have written an Upstart job to run celery in my Ubuntu server. Here's my configuration file called celeryd.conf
# celeryd - runs the celery daemon
#
# This task is run on startup to run the celery daemon
description "run celery daemon"
start on startup
expect fork
respawn
exec su - trakklr -c "/app/trakklr/src/trakklr celeryd --events --beat --loglevel=debug --settings=production"
When I execute sudo service celeryd start, the celeryd process starts just fine and all of the worker processes start fine.
But when I execute sudo service celeryd stop, it stops most of the processes, but a few processes are left hanging.
Why is this happening? I'm using Celery 2.5.3.
Here's an issue from the Github tracker.
https://github.com/celery/django-celery/issues/142
I still use init.d to run celery, so this may not apply. With that in mind, stopping the celery service sends the TERM signal to celery. This tells the workers not to accept new tasks, but it does not terminate existing tasks. Therefore, depending on how long your tasks take to execute, you may see tasks running for some time after telling celery to stop. Eventually, they will all shut down unless you have some other problem.
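For what it's worth, a sketch of the difference, assuming you can identify the main worker's PID: TERM asks for a warm shutdown, QUIT for a cold one.
# Warm shutdown: stop accepting new tasks, wait for running ones to finish.
kill -TERM <main-worker-pid>
# Cold shutdown: stop immediately without waiting for running tasks.
kill -QUIT <main-worker-pid>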
I wasn't able to figure this out, but it seemed to be an issue with my older celery version. I found it mentioned on their issue tracker, and I guess it points to the same issue:
https://github.com/celery/django-celery/issues/142
I upgraded my celery and django-celery to the 3.x.x versions and this issue was gone.
