I have unit tests for my Django project.
Some views in my Django project run Celery tasks, and I want to check the database after these tasks have run.
I have separate tests for the Celery tasks, where I call them directly, without the .delay() method.
The main question: what is the best and cleanest way to have a Celery worker running during the Jenkins job?
Currently I just run nohup celery -A myqpp worker & before the tests and kill all running Celery processes at the end of the job.
The best and cleanest way is not to have any Celery workers running during the Jenkins job at all, nor any queue/result backend. Use the CELERY_ALWAYS_EAGER setting to execute your tasks locally and synchronously in unit tests, blocking until each task returns.
Read more in the Celery documentation: CELERY_ALWAYS_EAGER docs
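For illustration, a minimal sketch of how this could look in a Django test case, assuming a standard Celery setup (the view URL and class names are placeholders; on Celery 4+ the lowercase task_always_eager setting is the equivalent):

# Uses the pre-4.0 uppercase setting names referenced in this answer.
from django.test import TestCase, override_settings

@override_settings(CELERY_ALWAYS_EAGER=True,
                   CELERY_EAGER_PROPAGATES_EXCEPTIONS=True)
class MyViewTests(TestCase):
    def test_view_triggers_task(self):
        # With eager mode on, .delay() runs the task synchronously
        # in-process, so the database can be checked immediately.
        response = self.client.post('/my-view/')  # hypothetical URL
        self.assertEqual(response.status_code, 200)
        # ... assert on the database state produced by the task ...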
To extend the answer about always-eager mode: you can see my answer to another question about how to run a Celery worker from the test setUp: https://stackoverflow.com/a/42107423/590233
But a few things need to be done there:
Connect the Celery worker to the test database
Somehow run a message broker instance ... (I think you already run one before the tests, but the cleanest way is to spawn a broker instance from setUp, as with the Celery worker)
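For reference, here is a minimal sketch of that setUp approach using the embedded test worker that ships with Celery 4+ (the app import path is a placeholder, and a broker instance must still be reachable):

from django.test import TransactionTestCase
from celery.contrib.testing.worker import start_worker

from myproject.celery import app  # hypothetical Celery app module

class TaskIntegrationTests(TransactionTestCase):
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        # Spawn a worker thread inside the test process so it picks up
        # the test database configuration automatically.
        cls.celery_worker = start_worker(app, perform_ping_check=False)
        cls.celery_worker.__enter__()

    @classmethod
    def tearDownClass(cls):
        cls.celery_worker.__exit__(None, None, None)
        super().tearDownClass()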
I am building an app and trying to run some tasks every day. I saw some answers, blogs, and tutorials about Celery, and I liked the idea of using it for background jobs.
But I have some questions about Celery:
The Celery documentation says that after defining a task I have to run a command like celery -A proj worker -l INFO, which will process the tasks. So my question is: do I have to stop the running server to execute this command?
And what if I deploy the Django project with Celery on Heroku or PythonAnywhere?
Do I have to run this command every time, or can I execute it once and then start the server?
If I have to run this command every time to perform background tasks, how is that possible when deploying to Heroku?
Will Celery's background tasks keep running after executing python manage.py runserver in just one terminal?
Why I am in doubt:
What I think is: while celery -A proj worker -l INFO is processing (or running) the tasks, I cannot execute runserver in the same terminal.
Any help would be much appreciated. Thank you.
Do I have to run this command every time, or can I execute it once and then start the server?
Dockerize your Celery worker and write your own script to auto-run it.
You can't run a Celery worker and the Django application in one terminal simultaneously, because both are long-running programs that need to run in parallel. So use two terminals: one for Django and another for the Celery worker.
I highly recommend reading this Heroku development article on using Celery and Django on Heroku.
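For illustration, on Heroku those two parallel programs are declared as separate process types in a Procfile, roughly like this (a sketch; proj is a placeholder for your project module):

web: gunicorn proj.wsgi
worker: celery -A proj worker -l INFO

Heroku then runs the web and worker dynos side by side, which replaces the two-terminal setup you use locally.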
This post is a continuation of my previous post: celery how to implement single queue with multiple workers executing in parallel?
I set up Celery to work with eventlet using this command:
celery -A project worker -P eventlet -l info --concurrency=4
I can see that my tasks are moved to the active list faster (in Flower), but I am not sure whether they are executing in parallel. I have a 4-core server for production, but I am not utilizing all the cores at the same time.
My question is:
How can I use all 4 cores to execute tasks in parallel?
Both the eventlet and gevent worker pools provide a great solution for concurrency, at the cost of limiting parallelism to 1 (a single OS process). To get true parallel task execution and utilise all cores, run several Celery worker instances on the same machine.
I know this runs counter to what popular Linux distros have in mind, so just ignore the system packages and roll your own configuration from scratch. A systemd service template is your friend; a sketch follows below.
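For example, a template unit along these lines (a sketch; the user, paths, and project name are placeholders) lets you start one worker instance per core:

# /etc/systemd/system/celery@.service
[Unit]
Description=Celery worker %i
After=network.target

[Service]
User=celery
WorkingDirectory=/srv/project
# %i is the systemd instance name; %% escapes the % that celery expands.
ExecStart=/srv/project/venv/bin/celery -A project worker -P eventlet -n worker%i@%%h -l info
Restart=on-failure

[Install]
WantedBy=multi-user.target

Then systemctl start celery@{1..4} brings up four independent worker processes, one per core.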
Another option is to run Celery with the prefork pool: you get parallelism at the cost of limiting concurrency to the number of worker processes.
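For example, since prefork is Celery's default pool, a single command along these lines (project is a placeholder) spawns four OS processes and can use all four cores:

celery -A project worker -l info --concurrency=4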
I would like to run APScheduler as part of a WSGI webapp (via Apache's mod_wsgi with 3 workers). I am new to the WSGI world, so I would appreciate it if you could resolve my doubts:
If APScheduler is part of the webapp, does it come alive only after the first request (the first one after starting/restarting Apache) is handled by at least one worker? That is, starting/restarting Apache won't start it; at least one request is needed.
What about concurrent requests: would every worker run the same set of APScheduler tasks, or would there be only one set shared between all workers?
Would a process (the webapp run via a worker), once running, stay alive (so APScheduler's tasks keep executing), or could it terminate after some idle time (in which case APScheduler's tasks won't execute)?
Thank you!
You're right: the scheduler won't start until the first request comes in.
Therefore, running a scheduler in a WSGI worker is not a good idea. A better approach is to run the scheduler in a separate process and connect to it when necessary via some RPC mechanism like RPyC or Execnet.
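For illustration, a minimal sketch of such a separate scheduler process exposed over RPyC (module names and the port are placeholders; job targets are passed as textual references like 'mymodule:myfunc' so no code objects have to cross the RPC boundary):

# scheduler_server.py - standalone scheduler process (sketch)
import rpyc
from rpyc.utils.server import ThreadedServer
from apscheduler.schedulers.background import BackgroundScheduler

scheduler = BackgroundScheduler()

class SchedulerService(rpyc.Service):
    # WSGI workers connect here instead of running their own scheduler,
    # so there is exactly one set of jobs regardless of worker count.
    def exposed_add_job(self, func, *args, **kwargs):
        return scheduler.add_job(func, *args, **kwargs)

    def exposed_remove_job(self, job_id):
        scheduler.remove_job(job_id)

if __name__ == '__main__':
    scheduler.start()
    ThreadedServer(SchedulerService, port=12345).start()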
I'm using periodic celery tasks with Django. I used to have the following task in my app/tasks.py file:
@periodic_task(run_every=timedelta(minutes=2))
def stuff():
    ...
But now this task has been removed from my app/tasks.py file. However, I keep seeing calls to this task in my Celery logs:
[2013-05-21 07:08:37,963: ERROR/MainProcess] Received unregistered task of type u'app.tasks.stuff'.
It seems that the celery beat scheduler that I use does not update its queue. This is how the scheduler is defined in my project/settings.py file:
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
Restarting the celery worker does not help. FYI, I use a Redis broker.
How can I either clear or update the celery beat queue so that older tasks are not sent to my celery worker?
Install django-celery.
As cited, this project is not needed to use Celery, but you do need it to enable the admin interface at /admin/djcelery/ for managing periodic tasks. Initially there will be no registered or periodic tasks.
Restart beat and check the Periodic tasks table again. Beat will have added the existing scheduled tasks to that table, with the interval or crontab defined in the settings or the decorators. There you can delete the unwanted tasks; a shell-based alternative is sketched below.
UPDATE: From Celery 4 onwards, it's recommended to use the django-celery-beat package instead: https://github.com/celery/django-celery-beat
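Alternatively, the stale entry can be deleted from the Django shell; a sketch assuming django-celery's models (the task path matches the log message above):

# Run inside: python manage.py shell
from djcelery.models import PeriodicTask

# Remove the beat entry that still points at the removed task.
PeriodicTask.objects.filter(task='app.tasks.stuff').delete()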
Delete the .pyc file for the module where the task was originally written. Or just delete all .pyc files in your project directory.
This command should work:
find . -name "*.pyc" -exec rm -rf {} \;
How do I remove all .pyc files from a project?
I have written an Upstart job to run Celery on my Ubuntu server. Here's my configuration file, called celeryd.conf:
# celeryd - runs the celery daemon
#
# This task is run on startup to run the celery daemon
description "run celery daemon"
start on startup
expect fork
respawn
exec su - trakklr -c "/app/trakklr/src/trakklr celeryd --events --beat --loglevel=debug --settings=production"
When I execute sudo service celeryd start, the celeryd process starts just fine and all the worker processes start fine.
But when I execute sudo service celeryd stop, it stops most of the processes, but a few are left hanging.
Why is this happening? I'm using Celery 2.5.3.
Here's an issue from the GitHub tracker:
https://github.com/celery/django-celery/issues/142
I still use init.d to run Celery, so this may not apply. With that in mind: stopping the celery service sends the TERM signal to Celery, which tells the workers not to accept new tasks but does not terminate tasks that are already executing. So, depending on how long your tasks take, you may see worker processes for some time after telling Celery to stop. Eventually they will all shut down, unless you have some other problem.
I wasn't able to figure this out, but it seemed to be an issue with my older Celery version. I found this issue mentioned on their issue tracker, and I guess it points to the same problem:
https://github.com/celery/django-celery/issues/142
I upgraded celery and django-celery to the 3.x.x versions and this issue was gone.