Django Task/Command Execution Best Practice/Understanding - python

I've got a little problem understanding Django management commands. I've got a web application which displays some network traffic information from eth0. For that I've created a Python class which analyzes the traffic and creates/updates the corresponding data in the database. Something like this:
class Analyzer:
    def doSomething(self):
        # analyze the traffic, create/update data in the db
        pass

    def startAnalyzing(self):
        while 1:
            self.doSomething()
Then I create a management command which creates this class instance and runs startAnalyzing().
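A minimal sketch of such a command (module paths and names are just placeholders, not my real layout) would be:
# myapp/management/commands/analyze_traffic.py  (hypothetical path)
from django.core.management.base import BaseCommand
from myapp.analyzer import Analyzer  # assumed location of the class above

class Command(BaseCommand):
    help = "Run the (non-terminating) traffic analyzer."

    def handle(self, *args, **options):
        Analyzer().startAnalyzing()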
Now my question:
Is a management command the correct way to do this, given that the task never terminates (it runs the whole time) and is not started/stopped via the web application? Or what is the correct way?
Would it be better to start the "Analyzer" outside of Django? I'm new to Django and want to do it the right way.
Is it possible to start sniffing the traffic when I run manage.py runserver 0.0.0.0:8080?
Many thanks in advance.

What you're doing is not what management commands are intended for. Management commands are what the name implies: commands to manage something, quick actions, not a way to keep a process running for the entire lifetime of the web app.
To achieve what you want, you should write a simple Python script and keep it running with a process manager (supervisord, for example). You just have to set up Django at the beginning of the script so you have access to Django's ORM, which is probably the reason you've chosen Django.
So all in all, your script would look something like the following:
import sys, os
sys.path.insert(0, "/path/to/parent/of/project") # /home/projects/django-proj
os.environ.setdefault("DJANGO_SETTINGS_MODULE", 'proj.settings')
import django
django.setup()
from proj.app.models import DBModel
This way you can use Django's ORM as you would in a normal Django application. You can also provide templates and views of the database as you normally would.
The only thing that remains is to keep the script running, and that you can simply do with supervisord.

Related

Django does not exit after finishing tests when using threading module

I am doing integration tests using django.test.SimpleTestCase.
After running python manage.py test, the tests run successfully and the terminal hangs with the message:
---------------------------
Ran 5 tests in 1.365s
OK
The problem is that currently I go back to the terminal using CTRL+C, but I want to have automated tests in my CI/CD pipeline.
Did I do something wrong in the way I executed the tests? Or is this behaviour normal? In that case, is there a way in Bash to programmatically execute the tests and then exit?
EDIT:
After analysing my app in depth, I was able to identify what was causing that behaviour. I am using threading in a way like the following in my views.py:
import threading

def __pooling():
    wait_time = 10
    call_remote_server()
    threading.Timer(wait_time, __pooling).start()

__pooling()
Basically I need my application to do something from time to time, asynchronously.
Should I change the way I am doing the pooling? Or should I disable it (how?) during the tests?
I was able to identify what was causing that behaviour. I am using threading in a way like the following in my views.py:
def __pooling():
    wait_time = 10
    call_remote_server()
    threading.Timer(wait_time, __pooling).start()

__pooling()
Basically I need my application to do something from time to time, asynchronously. Should I change the way I am doing the pooling?
I don't fully understand your needs, but a more traditional approach would be to schedule a task (probably a management command) outside of Django itself. An OS-level scheduler like cron or Windows Task Scheduler, something like APScheduler, or a task queue like Celery would all be reasonable choices.
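As a rough sketch (the command name and the import path of call_remote_server are placeholders for your own code), the periodic work could live in a management command that cron invokes every few minutes:
# myapp/management/commands/poll_remote.py  (hypothetical name)
from django.core.management.base import BaseCommand
from myapp.views import call_remote_server  # wherever your existing function lives (path is a guess)

class Command(BaseCommand):
    help = "Call the remote server once; let cron handle the scheduling."

    def handle(self, *args, **options):
        call_remote_server()

# crontab entry, e.g. every 10 minutes:
# */10 * * * * /path/to/venv/bin/python /path/to/manage.py poll_remote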
Or should I disable it (how?) during the tests?
I don't recommend continuing to use your __pooling() function as it exists today. In my opinion this kind of thing doesn't belong in your views.py. But if you want to keep it, something like
from django.conf import settings

if not settings.DEBUG:
    __pooling()
might help. Your __pooling() function would only be called when DEBUG is falsy, as it should be in production. (If it is also falsy in your CI environment you could choose another existing setting, or add something to your settings.py specifically to control this.)
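For example, a dedicated flag (the setting name ENABLE_POOLING here is made up) keeps the behaviour explicit and easy to turn off in your test/CI settings:
# settings.py
ENABLE_POOLING = True   # set to False in test/CI settings

# views.py
from django.conf import settings

if getattr(settings, "ENABLE_POOLING", False):
    __pooling()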

Running python bot in django background

I've got a Django project with a simple form to take users' details. I want to have a Python bot running in the background, constantly checking the Django database for any changes. Is Celery the right tool for this job? Any other solution? Thank you
I don't think Celery is really what you want here - Celery is primarily for moving tasks that don't need to be dealt with in the same process to a separate worker, such as sending registration emails.
For this situation I'd be inclined to use Django's signals to trigger the required functionality whenever the appropriate changes are made to the database. For instance, if it needed to be triggered when a particular type of object was created, such as a new user, then you might use the post_save signal of the user model.
The bot would be in a separate process, but it's not too hard to communicate between processes using Redis. Just have the signal publish a message to Redis, and have the bot listen for that message and carry out the required action on that event.
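A rough sketch of that idea, assuming the bot should react to new users and that Redis runs locally (the channel name, handler and payload are placeholders, not a fixed API):
# signals.py (inside your Django app)
# make sure this module is imported (e.g. from your AppConfig.ready) so the receiver registers
import json
import redis
from django.contrib.auth import get_user_model
from django.db.models.signals import post_save
from django.dispatch import receiver

r = redis.Redis(host="localhost", port=6379, db=0)

@receiver(post_save, sender=get_user_model())
def notify_bot(sender, instance, created, **kwargs):
    if created:
        # publish a small message the bot can react to
        r.publish("user_created", json.dumps({"id": instance.pk}))

# bot.py (separate process)
import redis

def handle_new_user(payload):
    print("new user:", payload)  # placeholder for the bot's real action

r = redis.Redis(host="localhost", port=6379, db=0)
pubsub = r.pubsub()
pubsub.subscribe("user_created")
for message in pubsub.listen():
    if message["type"] == "message":
        handle_new_user(message["data"])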
I don't have the details of your needs, but there are a few ways to achieve such things:
The "constantly checking" approach:
A crontab entry which launches your Python script every minute.
As you said, you could use Celery beat to achieve, within your Python environment, what a crontab would do.
The "on change" approach:
Probably the best, if you have control of the Django project: you could have your script run on the form validation/save! For this, you can add a Celery task, run the Python script, use Django signals...

Run code on first Django start

I have a Django application that displays a webpage with data from a model based on the primary key passed in the URL. This all works fine, and the Django component is working perfectly for the most part.
My question, though, is how I can make it so that when the Django server boots up, code is called that creates a separate thread, which then monitors an external source and logs valid data from that source as a model in the database. I have tried multiple methods, such as using an AppConfig.
I have the threading code written, along with the section that creates the model and saves it in the database. My issue is that if I try to use an AppConfig to create the thread that would handle the code, I get a django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet. error and the server does not boot up.
Where would be an appropriate place for this code? Is my approach to the matter incorrect?
Trying to use threading to get around blocking processes like web servers is an exercise in pain. I've done it before and it's fragile and often yields unpredictable results.
A much easier idea is to create a separate worker that runs in a totally different process that you start separately. It would have the same database access and could even use your Django models. This is how hosts like Heroku approach this problem. It comes with the added benefit of being able to be tested separately and doesn't need to run at all while you're working on your main Django application.
These days, with a multitude of virtualization options like Vagrant and containerization options like Docker, running parallel processes and workers is trivial. In the wild they may literally be running on separate servers with your database on yet another server. As was mentioned in the comments, starting a worker process could easily be delegated to a separate Django management command. This, in turn, can be fairly easily turned into separate worker processes by gunicorn on your web server.
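A sketch of such a worker as a management command (the model, fetch function and sleep interval stand in for the monitoring code you already have):
# monitoring/management/commands/monitor_source.py  (hypothetical path)
import time
from django.core.management.base import BaseCommand
from monitoring.models import Reading  # your existing model (name is a guess)
from monitoring.sources import fetch_from_external_source  # your existing monitoring code (path is a guess)

class Command(BaseCommand):
    help = "Monitor the external source and log valid data to the database."

    def handle(self, *args, **options):
        while True:
            data = fetch_from_external_source()
            if data is not None:
                Reading.objects.create(**data)
            time.sleep(5)
You would then run it as its own process (python manage.py monitor_source) under supervisord, systemd or a Procfile, entirely separate from the web server processes.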

How can I ensure a Celery task runs with the right settings?

I have two sites running essentially the same codebase, with only slight differences in settings. Each site is built in Django, with a WordPress blog integrated.
Each site needs to import blog posts from WordPress and store them in the Django database. When a user publishes a post, WordPress hits a webhook URL on the Django side, which kicks off a Celery task that grabs the JSON version of the post and imports it.
My initial thought was that each site could run its own instance of manage.py celeryd (each in its own virtualenv), and the two sites would stay out of each other's way. Each is daemonized with a separate upstart script.
But it looks like they're colliding somehow. I can run one at a time successfully, but if both are running, one instance won't receive tasks, or tasks will run with the wrong settings (in this case, each has a WORDPRESS_BLOG_URL setting).
I'm using a Redis queue, if that makes a difference. What am I doing wrong here?
Have you specified the name of the default queue that Celery should use? If you haven't set CELERY_DEFAULT_QUEUE, then both sites will be using the same queue and getting each other's messages. You need to set this setting to a different value for each site to keep the messages separate.
Edit
You're right, CELERY_DEFAULT_QUEUE is only for backends like RabbitMQ. I think you need to set a different database number for each site, using a different number at the end of your broker url.
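For example, with a Redis broker each site can point at a different Redis database (whether the setting is called BROKER_URL or CELERY_BROKER_URL depends on your Celery/django-celery version):
# settings.py for site A
BROKER_URL = "redis://localhost:6379/0"

# settings.py for site B
BROKER_URL = "redis://localhost:6379/1"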
If you are using django-celery then make sure you don't have an instance of celery running outside of your virtualenvs. Then start the celery instance within your virtualenvs using manage.py celeryd like you have done. I recommend setting up supervisord to keep track of your instances.

Where should I place a one-time operation in the Django framework?

I want to perform some one-time operations, such as starting a background thread that populates a cache every 30 minutes, as an initialization action when the Django server starts, so it will not block users from visiting the website. Where should I place this code in Django?
Putting it into the settings.py file does not work; it seems to cause a circular dependency.
Putting it into the __init__.py file does not work either; the Django server calls it many times (what is the reason for that?).
I just create standalone scripts and schedule them with cron. Admittedly it's a bit low-tech, but It Just Works. Just place this at the top of a script in your project's top-level directory and call it as needed.
#!/usr/bin/env python
# Note: setup_environ() is the legacy (pre-Django 1.4) bootstrap for standalone scripts;
# on newer versions use django.setup() as shown earlier on this page.
from django.core.management import setup_environ
import settings
setup_environ(settings)
from django.db import transaction
# random interesting things
# If you change the database, make sure you use this next line
transaction.commit_unless_managed()
We put one-time startup scripts in the top-level urls.py. This is often where your admin bindings go -- they're one-time startup, also.
Some folks like to put these things in settings.py but that seems to conflate settings (which don't do much) with the rest of the site's code (which does stuff).
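A sketch of what that can look like at the top of urls.py (the startup function is whatever one-time work you need; its name and module here are made up):
# urls.py
from django.contrib import admin
from myproject.startup import run_startup_tasks  # hypothetical one-time startup hook

admin.autodiscover()   # the usual one-time admin binding
run_startup_tasks()    # your own one-time initialization

urlpatterns = [
    # ...
]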
For a one-time operation at server start, you can use custom commands; if you want a periodic task or a queue of tasks, you can use Celery.
__init__.py will be called every time the app is imported. So if you're using mod_wsgi with Apache, for instance with the prefork method, then every new process created is effectively 'starting' the project and thus importing __init__.py. It sounds like your best bet would be to create a new management command, and then cron it up to run every so often, if that's an option. Either that, or run that management command before starting the server. You could write a quick script that runs that management command and then starts the server, for instance.
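For instance, such a wrapper could be as small as this (the command name my_startup_command is a placeholder for your own management command):
#!/usr/bin/env python
# run.py -- run the one-time management command, then start the dev server
import subprocess

subprocess.check_call(["python", "manage.py", "my_startup_command"])
subprocess.check_call(["python", "manage.py", "runserver", "0.0.0.0:8080"])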
