I'd like to run my Django app's tests in several threads (possibly dozens) in parallel. This is because my app spends almost all of its time waiting for remote requests, and I reckon that if I run the tests in parallel, they would all work at the same time without slowing each other down, and the whole suite would be over pretty quickly.
But... Tests are currently running with Django's runserver, which is single-threaded. So it won't be able to serve dozens of requests in parallel.
(I use Django's ./manage.py test with django_nose to invoke the tests.)
One idea I have is to use devserver instead. The question is, will it automatically be used when invoking ./manage.py test?
And another question is: I ran into devserver rather randomly, and I don't know whether it has any competitors that might be better. Does it?
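For concreteness, here is roughly what I had in mind on the test-runner side, assuming django_nose forwards nose's multiprocess plugin options via NOSE_ARGS (the numbers are arbitrary, and nose parallelizes with processes rather than threads, though for I/O-bound tests the effect should be similar):

# settings.py used for tests -- a sketch, not verified against my setup
TEST_RUNNER = 'django_nose.NoseTestSuiteRunner'

# options for nose's multiprocess plugin; each process runs part of the suite
NOSE_ARGS = [
    '--processes=30',
    '--process-timeout=120',
]

This still leaves the single-threaded dev server problem open, which is what the question is really about.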
Use uWSGI:
pip install uwsgi
Create an .ini file for your project:
[uwsgi]
# set the http port
http = :8000
# change to django project directory
chdir = /var/www/myapp
# add /var/www to the pythonpath, in this way we can use the project.app format
pythonpath = /var/www
# set the project settings name
env = DJANGO_SETTINGS_MODULE=myapp.settings
# load django
module = django.core.handlers.wsgi:WSGIHandler()
Start it with the built-in HTTP server:
uwsgi --ini django.ini --async 10
--async sets the number of async cores, i.e. how many requests uWSGI can handle concurrently
http://projects.unbit.it/uwsgi/wiki/Quickstart
http://projects.unbit.it/uwsgi/wiki/Doc095
I've recently begun delving into django-celery, which is an asynchronous task queue for Django. It allows you to queue up tasks to run asynchronously so that you don't have to wait for responses. It's simple to install and get started with, and it would let your whole application take advantage of asynchronous queueing, not just your test suite.
http://django-celery.readthedocs.org/en/latest/getting-started/index.html
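As a rough illustration, assuming a Celery version that provides the shared_task decorator (the task name, module path, and URL below are made up, not from the django-celery docs):

# myapp/tasks.py -- a hypothetical task
from celery import shared_task
import requests


@shared_task
def fetch_remote_data(url):
    """Fetch a remote resource on a worker instead of in the request cycle."""
    response = requests.get(url, timeout=30)
    return response.status_code

A view would then queue it with fetch_remote_data.delay('https://example.com/api/data') and return immediately.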
For a Django-based server I require the simultaneous running of scripts in a fashion similar to cron jobs. I want to avoid the explicit usage of cron jobs and instead integrate these periodic tasks into the HTTP server initialization - that is, when I run either manage.py runserver or a very similar management command, alongside the HTTP daemon, two other processes start that can perform my tasks periodically.
I already created management commands for these scripts. What are my options?
My best guess is starting two threads, either in AppConfig.ready() as suggested here or somehow in manage.py itself. I'm not entirely sure whether that has any caveats, though.
Since asking this question, I have realized that my only solution is to initialize the threads myself, and that I should do it explicitly in either asgi.py or wsgi.py, depending on my production setup - the runserver management command is not suitable for production.
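For reference, a minimal sketch of what I mean by explicit thread startup in wsgi.py (the command names and intervals are placeholders, and this ignores edge cases such as multiple worker processes each starting their own threads):

# proj/wsgi.py -- sketch only
import os
import threading
import time

from django.core import management
from django.core.wsgi import get_wsgi_application

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')

application = get_wsgi_application()


def run_periodically(command_name, interval_seconds):
    """Run a management command in a loop, sleeping between runs."""
    while True:
        management.call_command(command_name)
        time.sleep(interval_seconds)


# daemon threads so they do not block process shutdown
for cmd, interval in (('sync_data', 300), ('send_reports', 3600)):
    threading.Thread(target=run_periodically,
                     args=(cmd, interval), daemon=True).start()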
I have a follow-on / clarification question related to an older question
I have 2 servers (for now). One server runs a Django web application. The other server runs pure Python scripts that are CRON-scheduled data acquisition & processing jobs for the web app.
There is a use case where user activity in the web application (updating a certain field) should trigger a series of actions by the backend server. I could stick with CRON but as we scale up, I can imagine running into trouble. Celery seems like a good solution except I'm unclear how to implement it. (Yes, I did read the getting started guide).
I want the web application to send tasks to a specific queue but the backend server to actually execute the work.
Assuming that both servers are using the same broker URL,
Do I need to define stub tasks in Django, or can I just use the celery.send_task method?
Should I still be using django-celery?
Meanwhile the backend server will be running Celery with the full implementation of the tasks and workers?
I decided to try it and work through any issues that came up.
On my django server, I did not use django-celery. I installed celery and redis (via pip) and followed most of the instructions in the First Steps with Django:
- updated the proj/proj/settings.py file to include the bare minimum of configuration for Celery, such as the BROKER_URL
- created the proj/proj/celery.py file, but without the task defined at the bottom
- updated the proj/proj/__init__.py file as documented
Since the server running Django wasn't actually going to execute any Celery tasks, in the view that would trigger a task I added the following:
from proj.celery import app as celery_app

try:
    # send it to celery for backend processing
    celery_app.send_task('tasks.mytask',
                         kwargs={'some_id': obj.id, 'another_att': obj.att},
                         queue='my-queue')
except Exception as err:
    print('Issue sending task to Celery')
    print(err)
The other server had the following installed: celery and redis (I used an AWS Elasticache redis instance for this testing).
This server had the following files:
- celeryconfig.py with all of my Celery configuration and queues defined, pointing to the same BROKER_URL as the Django server
- tasks.py with the actual code for all of my tasks
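For reference, here is a rough sketch of what those two files might look like (the broker host is a placeholder and the task body is made up; only the task name has to match the 'tasks.mytask' string passed to send_task):

# celeryconfig.py
from kombu import Queue

BROKER_URL = 'redis://my-elasticache-host:6379/0'  # same broker as the Django server
CELERY_QUEUES = (
    Queue('my-queue1'),
    Queue('my-queue2'),
)

# tasks.py
from celery import Celery

import celeryconfig

app = Celery('tasks')
app.config_from_object(celeryconfig)


@app.task(name='tasks.mytask')
def mytask(some_id=None, another_att=None):
    # the actual backend processing goes here
    print('processing object %s (%s)' % (some_id, another_att))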
The celery workers were then started on this server, using the standard command: celery -A tasks worker -Q my-queue1,my-queue2
For testing, the above worked. Now I just need to make celery run in the background and optimize the number of workers/queue.
If anyone has additional comments or improvements, I'd love to hear them!
I am working on a web application that uses a permanent object, MyService. Using a web interface I am dynamically updating its state and monitoring its behavior. Now I would like to periodically call one of its methods. I was thinking of using a celery PeriodicTask, but I ran into some scope issues. It seems I need to execute three different processes:
python manage.py runserver
python manage.py celery worker
python manage.py celerybeat
The problem is that even if I ensure that MyService is a singleton that can be safely used by more than one thread, Celery creates its own fresh copy of the object. Is there a way I could share this object between both the Django server and the Celery main process? I tried to find a way to start Celery from within a Django script, but so far with no success. I would appreciate any help.
If you need to share something between multiple processes, or maybe even multiple machines (e.g. your workers could run on a separate machine), the best (and probably easiest) way to share information would be to use an external service.
In the simplest case you could use Django's DB, but if you find that this is not suitable, for example because you have a heavy write load, you can use something like Redis or Memcached (which you can also talk to via Django's caching API). These will let you handle a big write load, and you can use e.g. Redis as a queue for Celery as well.
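For example, a minimal sketch using Django's cache API with a shared backend such as Redis or Memcached (the key name and payload are illustrative):

from django.core.cache import cache

# in the web process: publish the current state of MyService
cache.set('myservice:state', {'enabled': True, 'interval': 60}, timeout=None)

# in the Celery worker: read it back before running the periodic method
state = cache.get('myservice:state', {})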
One can easily start Django using a management command like this:
from django.core import management
management.call_command('runserver', interactive=False)
But it actually blocks execution.
Is there any workaround apart from subprocess/threading/multiprocessing? I mean, how can I do it in a more native fashion?
A management command is not "starting django".
You "start django" by deploying on any number of web servers, each of which has methods to run in the background.
https://docs.djangoproject.com/en/dev/howto/deployment/
Dynamically deploying django isn't something I've seen, but I suppose you could write some scripts that generate webserver configuration files.
manage.py runserver should never be used in production.
If that was just an example, and you actually want to run other asynchronous management commands, the accepted community answer is to use a task queue like Celery.
http://docs.celeryproject.org/en/latest/django/
You could then fire off 10000 non-blocking management commands to be consumed "in the future" at some point by Celery workers.
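As a rough sketch, assuming Celery is already wired up for the project (the wrapper task below is hypothetical, and 'clearsessions' is just an example of a built-in command):

from celery import shared_task
from django.core import management


@shared_task
def run_management_command(name, *args, **options):
    # run the command on a worker instead of blocking the caller
    management.call_command(name, *args, **options)

The web process can then call run_management_command.delay('clearsessions') and move on immediately.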
I am writing an API that reads from MySQL and Solr (which can give latencies of 150ms) to provide formatted output. I will be hosting this on a VPS, and I need to choose a web server for this application. It will be used only within localhost (and local LAN in future).
I have these requirements:
1. Launches multiple worker threads to minimize bottlenecks with concurrent requests (Solr can take 150ms to return a request)
2. Can easily respawn when a component crashes, and restarting is just a matter of servd -restart
3. Deploying a new application is as simple as copying a folder to the www directory (or equivalent), so that new requests to this app will be served from then on
I am not optimizing for performance for now, so I need something easy to set up. And is #3 not possible for a non-load-balanced Django app?
Gunicorn is very simple to deploy and manage. It has no built-in reloading capability, but you could easily use an external utility such as watchdog to monitor a directory and reload Gunicorn using kill -HUP <pid>.
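A rough sketch of that watchdog idea, with the deploy directory and pidfile locations as assumptions:

import os
import signal
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

GUNICORN_PIDFILE = '/var/run/gunicorn.pid'  # assumes gunicorn was started with --pid
WATCH_DIR = '/var/www'                      # directory where new apps are dropped


class ReloadHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        # HUP makes the gunicorn master reload its workers
        with open(GUNICORN_PIDFILE) as f:
            os.kill(int(f.read().strip()), signal.SIGHUP)


observer = Observer()
observer.schedule(ReloadHandler(), WATCH_DIR, recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
finally:
    observer.stop()
    observer.join()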