I am trying to schedule a function to run every day using the Schedule library.
My local Django server, however, hangs and becomes unresponsive during the system check after I add the schedule code. Only when I remove the schedule code does the system check pass and the server run without problems.
I have copied the example directly from the documentation, and the server is not returning any errors, so I am unsure what the problem is.
views.py
....
def test_task():
    user = User.objects.get(pk=1)
    user.task_complete = True
    user.save()

schedule.every(10).minutes.do(test_task)

while True:
    schedule.run_pending()
    time.sleep(1)
....
Terminal output (hangs here)
chaudim@TD app_one % python3 manage.py runserver
Watching for file changes with StatReloader
Performing system checks...
Django loads (imports) your files based on its settings.
When you put this while loop at global scope, it is executed on import. It runs the while loop until it's done, and it's never done. You can add a print statement in the loop if you want to see for yourself that this is the root cause.
Normally people use periodic tasks from Celery, but that might be overkill for your needs.
I'd advise creating a management command instead, so you can run python manage.py test_task, and at the OS level add a cron job that runs this command every 10 minutes.
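A minimal sketch of such a command, assuming the default Django user model (swap in your own; the task_complete field comes from your code above). Save it as your_app/management/commands/test_task.py:

from django.contrib.auth.models import User  # assumption: replace with your own model
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Marks the task as complete for user pk=1"

    def handle(self, *args, **options):
        user = User.objects.get(pk=1)
        user.task_complete = True
        user.save()
        self.stdout.write("test_task done")

A matching crontab entry (the interpreter and project paths are placeholders) would be:

*/10 * * * * /usr/bin/python3 /path/to/manage.py test_task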
Related
I have a script for a bot deployed on Azure that has to be always running. It's a Python bot that tracks Twitter mentions in real time by opening a stream listener.
The script fails every once in a while for reasons not directly related to the script itself (timeouts, connection errors, etc.). After searching for answers around here, I found this piece of code suggested as the best workaround for restarting the script every time it fails.
#!/usr/bin/env python3.7
import os

def run_bot():
    while True:
        try:
            os.system("test_bot.py start")
        except:
            pass

if __name__ == "__main__":
    run_bot()
I am logging all error messages to learn the reasons why it fails, but I think there must be a better way to achieve the same. I would very much appreciate some hints.
This is the wrong way to run a script: you are running it in a while loop forever.
A better way is to schedule your main script as a cron job: Execute Python script via crontab
You can schedule this job to run every minute, every hour, or at a specific time of day; it's up to you.
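For example, a crontab entry along these lines (added with crontab -e; the interpreter and script paths are placeholders for your own):

*/10 * * * * /usr/bin/python3 /home/azureuser/test_bot.py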
If you wish to run something continuously, like a system monitor, then running that part inside a while True loop is fine. For example, a loop which checks the temperature every 5 seconds, writes it to a file, and sleeps.
Sample pseudocode for the script, prog.py:
import time

while True:
    log_temp()     # your own function that records the temperature
    time.sleep(5)  # seconds
But if the script fails, schedule something external to restart it. Don't start the script inside another while loop.
Something like this: https://unix.stackexchange.com/questions/107939/how-to-restart-the-python-script-automatically-if-it-is-killed-or-dies
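For instance, a minimal systemd unit in the spirit of that link (the unit name and all paths are assumptions); save it as /etc/systemd/system/test_bot.service and enable it with systemctl enable --now test_bot:

[Unit]
Description=Twitter bot

[Service]
# Restart the bot automatically whenever it exits, for any reason
ExecStart=/usr/bin/python3 /home/azureuser/test_bot.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target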
I'm new to Django and web frameworks in general. I have an app that is all set up and works perfectly fine on my localhost.
The program uses Twitter's API to gather a bunch of tweets and displays them to the user. The only problem is that I need the Python program that gets the tweets to run in the background every so often.
This is where using the schedule module would make sense, but once I start the local server it never runs the scheduled functions. I tried reading up on cron jobs and just can't seem to get it to work. How can I get Django to run a specific Python file periodically?
I've encountered a similar situation and have had a lot of success with django-apscheduler. It is all self-contained: it runs with the Django server, and jobs are tracked in the Django database, so you don't have to configure any external cron jobs or anything to call a script.
Below is a basic way to get up and running quickly, but the links at the end of this post have far more documentation and details as well as more advanced options.
Install with pip install django-apscheduler, then add it to your INSTALLED_APPS:
INSTALLED_APPS = [
    ...
    'django_apscheduler',
    ...
]
Once installed, make sure to run migrate on the database to create the tables django-apscheduler needs.
Create a scheduler python package (a folder in your app directory named scheduler with a blank __init__.py in it). Then, in there, create a file named scheduler.py, which should look something like this:
from apscheduler.schedulers.background import BackgroundScheduler
from django_apscheduler.jobstores import DjangoJobStore, register_events
from django.utils import timezone
from django_apscheduler.models import DjangoJobExecution
import sys

# This is the function you want to schedule - add as many as you want
# and then register them in the start() function below
def deactivate_expired_accounts():
    today = timezone.now()
    ...
    # get accounts, expire them, etc.
    ...

def start():
    scheduler = BackgroundScheduler()
    scheduler.add_jobstore(DjangoJobStore(), "default")
    # run this job every 24 hours
    scheduler.add_job(deactivate_expired_accounts, 'interval', hours=24, name='clean_accounts', jobstore='default')
    register_events(scheduler)
    scheduler.start()
    print("Scheduler started...", file=sys.stdout)
In your apps.py file (create it if it doesn't exist):
from django.apps import AppConfig

class AppNameConfig(AppConfig):
    name = 'your_app_name'

    def ready(self):
        from scheduler import scheduler
        scheduler.start()
A word of caution: when using this with DEBUG = True in your settings.py file, run the development server with the --noreload flag set (i.e. python manage.py runserver localhost:8000 --noreload), otherwise the scheduled tasks will start and run twice.
Also, django-apscheduler does not allow you to pass any parameters to the functions that are scheduled. It is a limitation, but I've never had a problem with it. You can load them from some external source, like the Django database, if you really need to; a sketch of that workaround follows.
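For instance, a hedged sketch of that workaround; TaskConfig here is a hypothetical model you would define yourself to hold the parameters:

from datetime import timedelta

from django.utils import timezone

from myapp.models import TaskConfig  # hypothetical model holding parameters

def deactivate_expired_accounts():
    # The scheduled function takes no arguments, so it reads its
    # parameters from the database at run time instead.
    config = TaskConfig.objects.get(name='clean_accounts')
    cutoff = timezone.now() - timedelta(days=config.grace_days)
    # ... deactivate accounts that expired before `cutoff` ...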
You can use all the standard Django libraries, packages and functions inside the apscheduler tasks (functions). For example, to query models, call external APIs, parse responses/data, etc. etc. It's seamlessly integrated.
Some additional links:
Project repository: https://github.com/jarekwg/django-apscheduler
More documentation:
https://medium.com/@mrgrantanderson/replacing-cron-and-running-background-tasks-in-django-using-apscheduler-and-django-apscheduler-d562646c062e
Another library you can use is django-q.
Django Q is a native Django task queue, scheduler and worker application using Python multiprocessing.
Like django-apscheduler, it can run and track jobs using the database Django is attached to. Or it can use full-blown brokers like Redis.
The only problem is I need my python program that gets the tweets to be run in the background every-so-often.
That sounds like a scheduler. (Django-q also has a tasks feature that can be triggered by events rather than being run on a schedule. The scheduler just sits on top of the task feature and triggers tasks on a defined schedule.)
There are four parts to this with django-q:
Install Django-q and configure it;
Define a task function (or set of functions) that you want to fetch the tweets;
Define a schedule that runs the tasks;
Run the django-q cluster that'll process the schedule and tasks.
Install django-q
pip install django-q
Configure it as an installed app in Django settings.py (add it to the INSTALLED_APPS list):
INSTALLED_APPS = [
    ...
    'django_q',
    ...
]
Then it needs its own configuration in settings.py (this configuration uses the database as the broker, rather than Redis or something external to Django):
# Settings for Django-Q
# https://mattsegal.dev/simple-scheduled-tasks.html
Q_CLUSTER = {
    'orm': 'default',  # use Django's ORM and database as the broker
    'workers': 4,
    'timeout': 30,
    'retry': 60,
    'queue_limit': 50,
    'bulk': 10,
}
You'll then need to run migrations on the database to create the tables django-q uses:
python manage.py migrate
(This will create a bunch of schedule and task related tables in the database. They can be viewed and manipulated through the Django admin panel.)
Define a task function
Then create a new file for the tasks you want to run:
# app/tasks.py

def fetch_tweets():
    pass  # do whatever logic you want here
Define a task schedule
We need to add the schedule for running the tasks into the database.
python manage.py shell
from django_q.models import Schedule

Schedule.objects.create(
    func='app.tasks.fetch_tweets',  # module and function to run
    minutes=5,                      # run every 5 minutes
    repeats=-1                      # keep repeating, repeat forever
)
You don't have to do this through the shell; you can do it from a module of Python code, as sketched below. But you probably only need to create the schedule once.
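If you do create it from code, a get_or_create sketch (the func path matches the task above) avoids inserting a duplicate schedule on repeated runs:

from django_q.models import Schedule

# Look the schedule up by its function path; create it only if missing.
Schedule.objects.get_or_create(
    func='app.tasks.fetch_tweets',
    defaults={'minutes': 5, 'repeats': -1},
)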
Run the cluster
Once that's all done, you need to run the cluster that will process the schedule. Without a running cluster, the schedule and tasks will never be processed. The call to qcluster is blocking, so normally you run it in a separate window or process from the Django server process.
python manage.py qcluster
When it runs you'll see output like:
09:33:00 [Q] INFO Q Cluster fruit-november-wisconsin-hawaii starting.
09:33:00 [Q] INFO Process-1:1 ready for work at 11
09:33:00 [Q] INFO Process-1:2 ready for work at 12
09:33:00 [Q] INFO Process-1:3 ready for work at 13
09:33:00 [Q] INFO Process-1:4 ready for work at 14
09:33:00 [Q] INFO Process-1:5 monitoring at 15
09:33:00 [Q] INFO Process-1 guarding cluster fruit-november-wisconsin-hawaii
09:33:00 [Q] INFO Q Cluster fruit-november-wisconsin-hawaii running.
There's also some example documentation that's pretty useful if you want to see how to hook up tasks to reports, emails, signals, etc.; a minimal hook sketch follows.
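As a small taste of the task feature mentioned above, a hedged sketch of a one-off task with a completion hook (assume email_report lives in app/hooks.py; all module paths here are placeholders):

from django_q.tasks import async_task

# The hook receives the completed Task instance once the task finishes.
def email_report(task):
    print(task.success, task.result)

# Queue the task; the cluster runs it and then calls the hook.
async_task('app.tasks.fetch_tweets', hook='app.hooks.email_report')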
I have a task to write a Python script which has to parse a web page once a week. I wrote the script, but I don't know how to make it run once a week. Could someone share advice and a possible solution?
Have a look at cron. It's not Python, but it fits the job much better in my opinion. For example:
@weekly python path/to/your/script
A similar question was discussed here.
Whether the script itself should repeat a task periodically usually depends on how frequently the task repeats. Once a week is usually better left to a scheduling tool like cron or at.
However, a simple method inside the script is to wrap your main logic in a loop that sleeps until the next desired start time, and let the script run continuously. (Note that a script cannot reliably restart itself; prefer an external solution.)
Instead of
def main():
    ...

if __name__ == '__main__':
    main()
use
import time

one_week = 7 * 24 * 3600  # seconds in a week

def main():
    ...

if __name__ == '__main__':
    while True:
        start = time.time()
        main()
        elapsed = time.time() - start
        # Guard against a run longer than a week:
        # time.sleep raises ValueError on negative values.
        time.sleep(max(0, one_week - elapsed))
Are you planning to run it locally? Are you working with a virtual environment?
Task scheduler option
If you are running it locally, you can use Task Scheduler on Windows. Setting up the task can be a bit tricky, I found, so here is an overview:
Open Task Scheduler > Create Task (in the Actions menu on the right)
In the "General" tab, name the task
In the "Triggers" tab, define your triggers (i.e. when you want the task to run)
In the "Actions" tab, press New > Start a program. Under Program/script, point to the location (full path) of your Python executable (python.exe). If you are working with a virtual environment, it is typically in venv\Scripts\python.exe; the full path would be C:\your_workspace_folder\venv\Scripts\python.exe. Otherwise it will most likely be in your Program Files.
Within the same tab, under Add arguments, enter the full path to your Python script, for instance: "C:\your_workspace_folder\main.py" (note that you need the quotes).
Press OK and save your task.
Debugging
To test whether your schedule works, you could right-click the task in Task Scheduler and press Run. However, then you don't see the logs of what is happening. I therefore recommend opening a terminal (e.g. cmd) and typing the following:
C:\your_workspace_folder\venv\Scripts\python.exe "C:\your_workspace_folder\main.py"
This lets you see the full trace of your code and whether it's running properly. Typical errors are related to file paths (e.g. using a relative path instead of the full path).
Sleep mode
It can happen that some tasks do not run because you don't have administrator privileges and your computer goes into sleep mode. A workaround I found is to keep the computer from sleeping using a .vbs script. Simply open Notepad and create a new file named idle.vbs (the extension must be .vbs, so make sure you select "All Files" when saving). In it, paste the following code:
Dim objResult
Set objShell = WScript.CreateObject("WScript.Shell")

' Toggle Num Lock twice every 60 seconds to keep the machine awake
Do While True
    objResult = objShell.SendKeys("{NUMLOCK}{NUMLOCK}")
    WScript.Sleep (60000)
Loop
I am running a server in Django which is taking in values continuously. The function that handles this uses a forever loop; when I call it, it never returns.
My problem: I want to take values continuously from the server and use them afterwards wherever I want.
I tried threading. What I thought I could do is make a background task which keeps feeding the database, and when I need the values, I can read them from it. But I don't know how to do this.
ip = "192.168.1.15"
port = 5005
def eeg_handler(unused_addr, args, ch1, ch2, ch3, ch4, ch5):
a.append(ch1)
print(a)
from pythonosc import osc_server, dispatcher
dispatcher = dispatcher.Dispatcher()
dispatcher.map("/muse/eeg", eeg_handler, "EEG")
server = osc_server.ThreadingOSCUDPServer(
(ip, port), dispatcher)
# print("Serving on {}".format(server.server_address))
server.serve_forever()
You can create a management command.
With a management command you can access your database in the same way you access it through Django.
Then you can schedule this command from cron, or you can let it run forever, because it will not block your application.
Another guide to writing management commands. A sketch for your case follows.
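For example, a hedged sketch of such a command wrapping the OSC server from your question (the app name, file path, and Sample model are placeholders you would replace):

# your_app/management/commands/collect_eeg.py
from django.core.management.base import BaseCommand
from pythonosc import dispatcher, osc_server

from your_app.models import Sample  # hypothetical model that stores readings

class Command(BaseCommand):
    help = "Runs the OSC listener and feeds readings into the database"

    def handle(self, *args, **options):
        def eeg_handler(unused_addr, args, ch1, ch2, ch3, ch4, ch5):
            Sample.objects.create(value=ch1)  # store one channel reading per row

        disp = dispatcher.Dispatcher()
        disp.map("/muse/eeg", eeg_handler, "EEG")
        server = osc_server.ThreadingOSCUDPServer(("192.168.1.15", 5005), disp)
        server.serve_forever()  # blocks this process, but not your Django app

Run it with python manage.py collect_eeg, either in its own terminal or under cron/systemd.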
You can use django-background-tasks, a database-backed work queue for Django. You can follow their installation instructions from here.
A sample background task for your case would be:
from background_task import background

@background(schedule=60)
def feed_database(some_parameter):
    # feed your database here
    # you can also pass a parameter to this function
    pass
All you need is to call feed_database from regular code to activate your background task; this will create a Task object, store it in the database, and run the function after 60 seconds.
In your case you want to run this function infinitely, so you can do something like this:
feed_database(some_parameter, repeat=60, repeat_until=None)
This will run your function once every 60 seconds, infinitely.
They also provide a Django management command to process your tasks (if you don't want to start them from your code): python manage.py process_tasks.
Is there something special that I need to do when working with cron jobs for separate modules? I can't figure out why a request to the cron job at localhost:8083/tasks/crontask (localhost:8083 runs the workers module), which is supposed to just print a simple line, doesn't print to the console, even though it says the request was successful when I run it by going to http://localhost:8000/cron and hitting the run button.
If I refresh the page localhost:8083/tasks/crontask as a way of triggering the cron job, it times out.
Again, if I go to localhost:8001 and hit the run button, it says the request to /tasks/crontask was successful, but it doesn't print to the console like it's supposed to.
In send_notifications_handler.py within the workers/handlers directory:
class CronTaskHandler(BaseApiHandler):
    def get(self):
        print "hello, this is a cron job"
In cron.yaml, outside the workers module:
cron:
- description: something
  url: /tasks/crontask
  schedule: every 1 minutes
  target: workers
In __init__.py in the workers/handlers directory:
from send_notifications_handler import CronTaskHandler

# --- Packaging
__all__ = [
    CounterWorker,
    DeleteGamesCronHandler,
    CelebrityCountsCronTaskHandler,
    QuestionTypeCountsCronHandler,
    CronTaskHandler,
]
In workers/routes.py:
Route('/tasks/crontask', handlers.CronTaskHandler, methods=['GET']),
That is actually the expected behavior when executing cron jobs locally.
If you take a look at the docs, they say the following:
The development server doesn't automatically run your cron jobs. You can use your local desktop's cron or scheduled tasks interface to trigger the URLs of your jobs with curl or a similar tool.
You will need to manually execute cron jobs on the local server by visiting http://localhost:8000/cron, as you mentioned in your post.
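Alternatively, as the docs suggest, you can trigger the handler's URL directly against the dev server with curl, e.g.:

curl http://localhost:8083/tasks/crontask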
++++++++++++++++++++ Updates / resolution +++++++++++++
The print statement is fine and does print to the console.
Yes, the cron job will fire once when using the dev server, although it doesn't repeat, which is normal behavior for dev servers.
The problem was that _ah/start in that module was routed to a pull queue that never stops. Removing the pull queue fixed the issue.
Thanks for the suggestions.