I'm new to Django and web frameworks in general. I have an app that is all set up and works perfectly fine on my localhost.
The program uses Twitter's API to gather a bunch of tweets and displays them to the user. The only problem is I need my python program that gets the tweets to be run in the background every so often.
This is where using the schedule module would make sense, but once I start the local server it never runs the schedule functions. I tried reading up on cronjobs and just can't seem to get it to work. How can I get Django to run a specific python file periodically?
I've encountered a similar situation and have had a lot of success with django-apscheduler. It is all self-contained - it runs with the Django server and jobs are tracked in the Django database, so you don't have to configure any external cron jobs or anything to call a script.
Below is a basic way to get up and running quickly, but the links at the end of this post have far more documentation and details as well as more advanced options.
Install with pip install django-apscheduler then add it to your INSTALLED_APPS:
INSTALLED_APPS = [
    ...
    'django_apscheduler',
    ...
]
Once installed, make sure to run makemigrations and migrate on the database.
Create a scheduler python package (a folder in your app directory named scheduler with a blank __init__.py in it). Then, in there, create a file named scheduler.py, which should look something like this:
from apscheduler.schedulers.background import BackgroundScheduler
from django_apscheduler.jobstores import DjangoJobStore, register_events
from django.utils import timezone
import sys

# This is the function you want to schedule - add as many as you want
# and then register them in the start() function below.
def deactivate_expired_accounts():
    today = timezone.now()
    ...
    # get accounts, expire them, etc.
    ...

def start():
    scheduler = BackgroundScheduler()
    scheduler.add_jobstore(DjangoJobStore(), "default")
    # run this job every 24 hours
    scheduler.add_job(deactivate_expired_accounts, 'interval', hours=24, name='clean_accounts', jobstore='default')
    register_events(scheduler)
    scheduler.start()
    print("Scheduler started...", file=sys.stdout)
In your apps.py file (create it if it doesn't exist):
from django.apps import AppConfig

class AppNameConfig(AppConfig):
    name = 'your_app_name'

    def ready(self):
        from .scheduler import scheduler
        scheduler.start()
A word of caution: when using this with DEBUG = True in your settings.py file, run the development server with the --noreload flag set (i.e. python manage.py runserver localhost:8000 --noreload), otherwise the scheduled tasks will start and run twice.
Also, django-apscheduler does not allow you to pass any parameters to the functions that are scheduled to be run. It is a limitation, but I've never had a problem with it. You can load them from some external source, like the Django database, if you really need to.
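Since the scheduled functions take no arguments, one common workaround is to have the job load its own settings from an external source each time it runs. Below is a minimal pure-Python sketch of the idea - the job_config.json file and the grace_days key are hypothetical; in a real Django app you would more likely query a model:

```python
import json
from pathlib import Path

# Hypothetical external config source; a real app might query a Django model instead.
CONFIG_PATH = Path("job_config.json")

def deactivate_expired_accounts():
    # The scheduler calls this with no arguments, so the job reads its
    # own settings at run time rather than receiving them as parameters.
    config = json.loads(CONFIG_PATH.read_text()) if CONFIG_PATH.exists() else {}
    grace_days = config.get("grace_days", 30)  # fall back to a default
    return grace_days
```

Editing the external source then changes the job's behavior on its next run, with no code change or restart.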
You can use all the standard Django libraries, packages and functions inside the apscheduler tasks (functions). For example, to query models, call external APIs, parse responses/data, etc. etc. It's seamlessly integrated.
Some additional links:
Project repository: https://github.com/jarekwg/django-apscheduler
More documentation:
https://medium.com/@mrgrantanderson/replacing-cron-and-running-background-tasks-in-django-using-apscheduler-and-django-apscheduler-d562646c062e
Another library you can use is django-q:

Django Q is a native Django task queue, scheduler and worker application using Python multiprocessing.

Like django-apscheduler, it can run and track jobs using the database Django is attached to. Or it can use a full-blown broker like Redis.
The only problem is I need my python program that gets the tweets to be run in the background every so often.
That sounds like a scheduler. (Django-q also has a tasks feature, that can be triggered by events rather than being run on a schedule. The scheduler just sits on top of the task feature, and triggers tasks at a defined schedule.)
There are four parts to this with django-q:

Install django-q and configure it;
Define a task function (or set of functions) that fetches the tweets;
Define a schedule that runs the tasks;
Run the django-q cluster that'll process the schedule and tasks.
Install django-q
pip install django-q
Configure it as an installed app in Django's settings.py:
INSTALLED_APPS = [
    ...
    'django_q',
    ...
]
Then it needs its own configuration in settings.py (this configuration uses the database as the broker, rather than Redis or something external to Django):
# Settings for Django-Q
# https://mattsegal.dev/simple-scheduled-tasks.html
Q_CLUSTER = {
    'orm': 'default',  # use Django's ORM and database as the broker
    'workers': 4,
    'timeout': 30,
    'retry': 60,
    'queue_limit': 50,
    'bulk': 10,
}
You'll then need to run migrations on the database to create the tables django-q uses:
python manage.py migrate
(This will create a bunch of schedule and task related tables in the database. They can be viewed and manipulated through the Django admin panel.)
Define a task function
Then create a new file for the tasks you want to run:
# app/tasks.py
def fetch_tweets():
    pass  # do whatever logic you want here
Define a task schedule
We need to add the schedule for running the tasks into the database:
python manage.py shell
from django_q.models import Schedule

Schedule.objects.create(
    func='app.tasks.fetch_tweets',   # module and function to run
    schedule_type=Schedule.MINUTES,  # interpret `minutes` as the interval
    minutes=5,                       # run every 5 minutes
    repeats=-1                       # keep repeating forever
)
You don't have to do this through the shell. You can do this in a module of python code, etc. But you probably only need to create the schedule once.
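If you do create the schedule from code rather than the shell, it's worth guarding against duplicates. A sketch using the standard get_or_create ORM method (assuming the same app.tasks.fetch_tweets function as above):

```python
from django_q.models import Schedule

# get_or_create keeps repeated startups (e.g. an AppConfig.ready() hook)
# from inserting the same schedule twice.
Schedule.objects.get_or_create(
    func='app.tasks.fetch_tweets',
    defaults={
        'schedule_type': Schedule.MINUTES,
        'minutes': 5,
        'repeats': -1,
    },
)
```

Note this is a fragment, not a standalone script - it must run inside a configured Django project (e.g. via python manage.py shell).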
Run the cluster
Once that's all done, you need to run the cluster that will process the schedule. Otherwise, without running the cluster, the schedule and tasks will never be processed. The call to qcluster is a blocking call. So normally you want to run it in a separate window or process from the Django server process.
python manage.py qcluster
When it runs you'll see output like:
09:33:00 [Q] INFO Q Cluster fruit-november-wisconsin-hawaii starting.
09:33:00 [Q] INFO Process-1:1 ready for work at 11
09:33:00 [Q] INFO Process-1:2 ready for work at 12
09:33:00 [Q] INFO Process-1:3 ready for work at 13
09:33:00 [Q] INFO Process-1:4 ready for work at 14
09:33:00 [Q] INFO Process-1:5 monitoring at 15
09:33:00 [Q] INFO Process-1 guarding cluster fruit-november-wisconsin-hawaii
09:33:00 [Q] INFO Q Cluster fruit-november-wisconsin-hawaii running.
There's also some example documentation that's pretty useful if you want to see how to hook up tasks to reports or emails or signals etc.
I am trying to schedule a function to run every day using the Schedule library.
My local Django server, however, hangs and becomes unresponsive during the system check after I add the schedule code. It is only when I remove the schedule code that the system check passes and the server runs without problem.
I have copied the example directly from the documentation, and the server is not returning any errors, so I am unsure what the problem is.
views.py
....
def test_task():
    user = User.objects.get(pk=1)
    user.task_complete = True
    user.save()

schedule.every(10).minutes.do(test_task)

while True:
    schedule.run_pending()
    time.sleep(1)
....
Terminal output (hangs here)
chaudim#TD app_one % python3 manage.py runserver
Watching for file changes with StatReloader
Performing system checks...
Django loads (imports) files based on its settings.
When you put this while loop at global scope, it is executed on import - and it runs until it's done, which is never. You can add a print statement there if you want to see for yourself that that's the root cause.
Normally people use periodic_tasks from celery, but it might be overkill for your needs.
I'd rather advise creating a management command, so you could run python manage.py test_task, and at the OS level just add a cron job that runs this command every 10 minutes.
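The cron side of that is a single crontab entry (added with crontab -e; the paths below are hypothetical - substitute your own virtualenv and project directory):

```shell
# m h dom mon dow  command - run the management command every 10 minutes
*/10 * * * * cd /path/to/project && /path/to/venv/bin/python manage.py test_task >> /tmp/test_task.log 2>&1
```

Redirecting output to a log file makes it much easier to see whether the job actually ran.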
There is a specific periodic task that needs to be removed from the message queue. I am using Redis and celery here.
tasks.py
@periodic_task(run_every=crontab(minute='*/6'))
def task_abcd():
    """
    some operations here
    """
There are other periodic tasks in the project as well, but I need this specific task to stop from now on.
As explained in this answer, will the following code work?
@periodic_task(run_every=crontab(minute='*/6'))
def task_abcd():
    pass
In this example the periodic task schedule is defined directly in code, meaning it is hard-coded and cannot be altered dynamically without a code change and an app re-deploy.
The provided code - with the task logic deleted, or with a simple return at the beginning - will work, but it is not the answer to the question: the task will still run; there is just no code that runs with it.
Also, the @periodic_task decorator itself is deprecated and not recommended:
"""Deprecated decorator, please use :setting:beat_schedule."""
First, change the method from a @periodic_task to just a regular celery @task - and because you are using Django, it is better to go straight for @shared_task:
from celery import shared_task

@shared_task
def task_abcd():
    ...
Now this is just one of your celery tasks, which needs to be called explicitly. Or it can be run periodically if added to the celery beat schedule.
For production, and if using multiple workers, it is not recommended to run the celery worker with embedded beat (-B) - run a separate instance of the celery beat scheduler.
The schedule can be specified in celery.py or in the Django project settings (settings.py).
It is still not very dynamic, though: to re-read the settings, the app needs to be reloaded.
Then, use the Database Scheduler, which allows creating schedules dynamically - which tasks need to run, when, and with what arguments. It even provides nice Django admin web views for administration!
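For reference, a sketch of enabling the database scheduler with the django-celery-beat package (your_project is a placeholder for your actual project module):

```shell
pip install django-celery-beat

# Add 'django_celery_beat' to INSTALLED_APPS in settings.py, then create its tables:
python manage.py migrate

# Run beat with the database-backed scheduler instead of the in-code schedule:
celery -A your_project beat --scheduler django_celery_beat.schedulers:DatabaseScheduler
```

Schedules can then be added, edited, and disabled through the Django admin while everything is running.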
That code will work but I'd go for something that doesn't force you to update your code every time you need to disable/enable the task.
What you could do is to use a configurable variable whose value could come from an admin panel, a configuration file, or whatever you want, and use that to return before your code runs if the task is in disabled mode.
For instance:
@periodic_task(run_every=crontab(minute='*/6'))
def task_abcd():
    config = load_config_for_task_abcd()
    if not config.is_enabled:
        return
    # some operations here
In this way, even if your task is scheduled, its operations won't be executed.
If you simply want to remove the periodic task, have you tried removing the function and then restarting your celery service? You can restart your Redis service, as well as your Django server, for good measure.
Make sure that the function you removed is not referenced anywhere else.
I need a thread running when I start my Django server; basically the thread just periodically processes some items from a database.
Where is the best place to start this thread?
I think this is generally a bad idea. You shouldn't have that kind of periodic thread running in the frontend process.
I would create a management command that will do the processing. Then I would set up a cron job (or any other mechanic provided by the hosting) calling the management command. This way you divide the work to logic places and you can also test the processing much easier.
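A minimal sketch of such a management command (the file path and command name are illustrative; the processing logic is whatever your thread would have done):

```python
# app/management/commands/process_items.py
# (both the management and commands folders need an empty __init__.py)
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Process pending items from the database"

    def handle(self, *args, **options):
        # query the database and process items here,
        # then let the process exit; cron handles the periodicity
        self.stdout.write("Processing complete.")
```

You would run it with python manage.py process_items, and let cron (or your host's scheduler) invoke that call periodically. It requires a Django project, so it is not runnable standalone.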
You want to execute code in the top-level urls.py. That module is imported and executed once on server startup.
in your urls.py
from django.conf.urls.defaults import *
from my_app import one_time_startup

urlpatterns = ...

one_time_startup()  # This is your function that you want to execute.
I am working on a Django web-based project in which I need to build an application that works in the following sequence:
1) The user opens a page where they enter a command and a time.
2) The Django application will execute that command at the given time each day, until the user turns the scheduler off (it is on by default).
The problem I am facing is:
1) How should I execute the commands at the given time each day? To save the commands and times, I created the following model in my models.py:
class commands(models.Model):
    username = models.ForeignKey(User)
    command = models.CharField(max_length=30)
    execution_time = models.DateField()
I have the time saved, but I am not finding the right way to execute the command each day at that time. Is it possible to do this with the pytz library?
For executing the commands I am using the paramiko library.
PS: I don't want to use any external library
While you could have your django app add and remove cron jobs on the system, another more django-ish approach would be to use Celery. It is a task queue system that can run both synch and async tasks.
One specific feature of Celery is scheduled tasks: http://packages.python.org/celery/userguide/periodic-tasks.html
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    "runs-every-30-seconds": {
        "task": "tasks.add",
        "schedule": timedelta(seconds=30),
        "args": (16, 16)
    },
}
They also have a more granular version of the periodic task that replicates the scheduling of a crontab:
from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    # Executes every Monday morning at 7:30 A.M.
    'every-monday-morning': {
        'task': 'tasks.add',
        'schedule': crontab(hour=7, minute=30, day_of_week=1),
        'args': (16, 16),
    },
}
Celery by itself is stand-alone, but there is the django-celery specific version.
The benefit of this solution is that you do not need to edit and maintain a system-level cron tab. This is a solution that is highly integrated into django for this exact use.
Also a huge win over using cron is that Celery can scale with your system. If you were using a basic system crontab, the tasks would be located on the server that hosts the application. But what if you needed to ramp up your site and run it on 5 web application nodes? You would need to centralize that crontab. If you are using Celery, you have a large number of options for how to transport and store tasks. It is inherently distributed and available to all your application servers. It is portable.
It seems to me that the proper way to do this would be to write a custom Django command and execute it via cron. But you seem to be in luck, as others have felt a similar need and have written custom Django apps. Take django-cron, for example.
The solution for your problem is the standard cron application (the task planner on *nix systems). You can schedule a script using cron (by adding it to the crontab).
If your script must run in your Django application environment, you can set that up with the setup_environ function. You can read more about standalone scripts for Django applications here.
I want to perform some one-time operations, such as starting a background thread and populating a cache every 30 minutes, as initialization actions when the Django server starts, so they will not block users from visiting the website. Where should I place this code in Django?
Putting them into the settings.py file does not work; it seems to cause a circular dependency.
Putting them into the __init__.py file does not work either; the Django server calls it many times (what is the reason?).
I just create standalone scripts and schedule them with cron. Admittedly it's a bit low-tech, but It Just Works. Just place this at the top of a script in your projects top-level directory and call as needed.
#!/usr/bin/env python
from django.core.management import setup_environ
import settings

setup_environ(settings)

from django.db import transaction

# random interesting things

# If you change the database, make sure you use this next line
transaction.commit_unless_managed()
We put one-time startup scripts in the top-level urls.py. This is often where your admin bindings go -- they're one-time startup, also.
Some folks like to put these things in settings.py but that seems to conflate settings (which don't do much) with the rest of the site's code (which does stuff).
For a one-time operation at server start you can use custom commands, or if you want a periodic task or a queue of tasks you can use celery.
__init__.py will be called every time the app is imported. So if you're using mod_wsgi with Apache, for instance with the prefork method, every new process created effectively 'starts' the project, thus importing __init__.py. It sounds like your best method would be to create a new management command, and then cron that up to run every so often, if that's an option. Either that, or run that management command before starting the server. You could write a quick script that runs that management command and then starts the server, for instance.