Celery - schedule periodic tasks starting at a specific time - python

What is the best way to schedule a periodic task starting at specific datetime?
(I'm not using cron for this because I need to schedule about a hundred remote rsyncs:
I compute the remote vs. local clock offset for each host and need to rsync each path the second the logs are generated on that host.)
By my understanding the celery.task.schedules crontab class only allows specifying hour, minute, day of week.
The most useful tip I've found so far was this answer by nosklo.
Is this the best solution?
Am I using the wrong tool for the job?

Celery seems like a good solution for your scheduling problem: Celery's PeriodicTasks have run time resolution in seconds.
You're using an appropriate tool here, but the crontab entry is not what you want. Use Python's datetime.timedelta object instead: the crontab scheduler in celery.schedules has only minute resolution, whereas configuring the PeriodicTask interval with a timedelta gives you per-second resolution.
e.g. from the Celery docs
>>> from celery.task import tasks, PeriodicTask
>>> from datetime import timedelta
>>> class EveryThirtySecondsTask(PeriodicTask):
...     run_every = timedelta(seconds=30)
...
...     def run(self, **kwargs):
...         logger = self.get_logger(**kwargs)
...         logger.info("Execute every 30 seconds")
http://ask.github.com/celery/reference/celery.task.base.html#celery.task.base.PeriodicTask
class datetime.timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)
The only challenge here is that you have to describe the frequency with which you want this task to run rather than at what clock time you want it to run; however, I would suggest you check out the Advanced Python Scheduler http://packages.python.org/APScheduler/
It looks like Advanced Python Scheduler could easily be used to launch normal (i.e. non-periodic) Celery tasks on any schedule of your choosing, using its own scheduling functionality.
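One way to bridge "run every N seconds" and "start at a given clock time" is to compute the first run time yourself. The following is a minimal stdlib sketch, not something from the Celery or APScheduler docs; `next_run_at` is a hypothetical helper name, and it assumes naive datetimes that all share one clock.

```python
from datetime import datetime, timedelta


def next_run_at(start, period, now):
    """Return the first time >= now on the grid start, start + period, ...

    Hypothetical helper: start and now are naive datetimes on the same
    clock, period is a positive timedelta.
    """
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    if now <= start:
        return start
    elapsed = now - start
    # Ceiling division: timedelta // timedelta yields an int (floor),
    # so negate twice to round up to the next whole period.
    periods = -(-elapsed // period)
    return start + periods * period


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0, 0)
    print(next_run_at(start, timedelta(seconds=30), datetime(2024, 1, 1, 12, 0, 45)))
```

The returned datetime could then be handed to whatever launches the task, for example as the `eta` argument of a Celery task's `apply_async`.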

I recently worked on a project that used Celery both for asynchronous operations and for scheduled tasks. In the end I fell back on plain old crontab for the scheduled task, although the cron entry calls a Python script that spawns a separate asynchronous task. This way I have less to maintain (running the Celery scheduler requires some further setup), and I still make full use of Celery's asynchronous capabilities.

Related

Web2py scheduler - Best practices to rerun task continuously and to add task at startup

I want to add a task to the queue at app startup. Currently I do this by adding a scheduler.queue_task(...) call to the main db.py file, which is not ideal because I had to define the task function in that file.
I also want the task to repeat every 2 minutes continuously.
What are the best practices for this?
As stated in the web2py docs, to rerun a task continuously you just have to specify it when queuing the task:
scheduler.queue_task(your_function,
                     pargs=your_args,
                     timeout=120,     # just in case
                     period=120,      # as you want to run it every 2 minutes
                     immediate=True,  # starts task ASAP
                     repeats=0        # just does the infinite repeat magic
                     )
To queue it at startup, you might want to use the web2py cron feature, in this simple way:
@reboot root *your_controller/your_function_that_calls_queue_task
Do not forget to enable this feature (-Y, more details in the doc).
It seems there is no real mechanism for this within web2py.
There are a few hacks one could use to repeat tasks continuously or schedule them at startup, but as far as I can see the web2py scheduler needs a lot of work.
The best option is to abandon this web2py feature and use Celery or similar for advanced usage.

How to do a job every half an hour

I want to do a job every half an hour. My application is based on Flask, and running on Windows.
Now I create a task for the job, using Windows scheduler service.
I want to know if there is another way I can do cyclic tasks using Flask’s built-in functions...
Sorry for my poor English.
I want to know if there is another way I can do [periodic] tasks using Flask’s built-in functions.
Flask is a minimalist microframework, so I don't think it has, or ever will have, a built-in feature for scheduling periodic tasks.
The customary way is what you have already done: write some Flask code that can be called as an HTTP endpoint or a script, then use an OS scheduling tool to execute it (e.g. Windows Task Scheduler, or cron on UNIX/Linux).
Otherwise, Flask works well with specialized libraries such as Celery (periodic tasks), which take care of those details and add some features that may not be available otherwise.
from datetime import timedelta
CELERYBEAT_SCHEDULE = {
    'every-half-hour': {
        'task': 'tasks.your_task',
        'schedule': timedelta(minutes=30),
        'args': ('task_arg1', 'task_arg2')
    },
}
CELERY_TIMEZONE = 'UTC'
I'm not sure if this helps, but I've been testing the schedule module; it's easy to use and works well:
$ pip install schedule
And this is a sample from the official documentation:
import schedule
import time
def job():
    print("I'm working...")
schedule.every(30).minutes.do(job)
while True:
    schedule.run_pending()
    time.sleep(1)
Hope this helps =)
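If installing a third-party package is not an option, the same polling idea can be sketched with only the standard library. This is an illustrative sketch rather than anything from Flask or the schedule docs: `run_every` and `demo` are hypothetical names, and the tiny interval in `demo` only exists so the example finishes quickly (a half-hourly job would pass 30 * 60).

```python
import threading
import time


def run_every(interval_seconds, job, stop_event):
    """Call job() every interval_seconds until stop_event is set."""
    # Event.wait returns False on timeout and True once the event is set,
    # so the loop sleeps for the interval and exits promptly on stop.
    while not stop_event.wait(interval_seconds):
        job()


def demo():
    """Run a trivial job on a background thread for a short while."""
    calls = []
    stop = threading.Event()
    worker = threading.Thread(
        target=run_every,
        args=(0.01, lambda: calls.append(time.monotonic()), stop),
        daemon=True,
    )
    worker.start()
    time.sleep(0.1)
    stop.set()
    worker.join()
    return len(calls)


if __name__ == "__main__":
    print("job ran", demo(), "times")
```

Because the wait starts only after each job finishes, the schedule drifts by the job's running time; the OS scheduler or Celery beat avoids that and also survives process restarts.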

Celery PeriodicTask per user

I'm working on a project whose main feature will be periodically running one type of async task for each user. Every user will be able to configure the task (running daily, weekly, etc. at a specified time). The task will also use some data stored by the user. Now I'm wondering which approach would be better: allowing users to create their own PeriodicTask (through some restricted endpoint, of course), or creating a single PeriodicTask (for example, running every 5 minutes) that iterates over all users and determines whether the task should be queued for the current user. I think I will use AMQP as the broker.
The periodic task scheduler in Celery is not designed to handle thousands of scheduled tasks, so from a performance perspective a much better solution is to have one task that runs at the smallest interval you offer (e.g. if you allow users to schedule daily, weekly, or monthly, running the task daily is enough).
This approach is also more stable, since otherwise every time a schedule changes, all of the schedule records are reloaded.
It is also more secure, because you do not expose or use any internal task-execution mechanisms.
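The single-dispatcher approach can be sketched in plain Python. Everything here is an assumption made for illustration: the `USER_SCHEDULES` records stand in for rows loaded from the database, and `is_due` and `users_to_queue` are hypothetical helpers that a periodic task running every 5 minutes would call before queuing the per-user work.

```python
from datetime import datetime, time, timedelta

# Hypothetical per-user settings; in practice these would come from the DB.
USER_SCHEDULES = [
    {"user_id": 1, "frequency": "daily", "at": time(8, 0)},
    {"user_id": 2, "frequency": "weekly", "weekday": 0, "at": time(8, 0)},  # Monday
]


def is_due(schedule, now, window=timedelta(minutes=5)):
    """True if the configured time falls in the window ending at this tick."""
    # Also check yesterday's slot so a window spanning midnight is not missed.
    for day in (now.date(), now.date() - timedelta(days=1)):
        scheduled = datetime.combine(day, schedule["at"])
        if (schedule["frequency"] == "weekly"
                and scheduled.weekday() != schedule["weekday"]):
            continue
        if timedelta(0) <= now - scheduled < window:
            return True
    return False


def users_to_queue(schedules, now):
    """Return the ids of users whose task should be queued at this tick."""
    return [s["user_id"] for s in schedules if is_due(s, now)]


if __name__ == "__main__":
    print(users_to_queue(USER_SCHEDULES, datetime(2024, 1, 1, 8, 0)))
```

The window must match the dispatcher's own interval; if the dispatcher can skip a tick, recording the last run per user would be more robust than a time window.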

How to dynamically add a scheduled task to Celery beat

Using Celery 3.1.23, I am trying to dynamically add a scheduled task to celery beat. I have one celery worker and one celery beat instance running.
Triggering a standard Celery task by running task.delay() works fine. When I define a scheduled periodic task as a setting in the configuration, celery beat runs it.
However, what I need is to be able to add a task at runtime that runs on a specified crontab. After I add a task to the persistent scheduler, celery beat doesn't seem to detect the newly added task. I can see that the celerybeat-schedule file does have an entry for the new task.
Code:
scheduler = PersistentScheduler(app=current_app, schedule_filename='celerybeat-schedule')
scheduler.add(name="adder",
              task="app.tasks.add",
              schedule=crontab(minute='*/1'),
              args=(1,2))
scheduler.close()
When I run:
print(scheduler.schedule)
I get:
{'celery.backend_cleanup': <Entry: celery.backend_cleanup celery.backend_cleanup() <crontab: 0 4 * * * (m/h/d/dM/MY)>,
'adder': <Entry: adder app.tasks.add(1, 2) <crontab: */1 * * * * (m/h/d/dM/MY)>}
Note that app.tasks.add has the @celery.task decorator.
Instead of trying to find a good workaround, I suggest you switch to Celery RedBeat.
You may be able to solve your problem by enabling autoreloading.
I'm not 100% sure it will work for your config file, but it should if the file is in the CELERY_IMPORTS paths.
Note, however, that this feature is experimental and should not be used in production.
If you really want dynamic celerybeat scheduling, you can always use another scheduler, like the django-celery one, to manage periodic tasks in the database via the Django admin.
I'm having a similar problem, and a solution I thought of is to pre-define some generic periodic tasks (every 1 s, every 5 min, etc.) and have each of them fetch from the DB a list of functions to execute.
Every time you want to add a new task you just add an entry in your DB.
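A rough sketch of that idea, independent of any particular database: rows name a function and an interval bucket, and each generic periodic task runs the rows in its own bucket. The names `REGISTRY`, `register`, and `run_bucket`, as well as the row layout, are assumptions made for illustration and not part of Celery.

```python
# Hypothetical registry mapping names stored in the DB to callables.
REGISTRY = {}


def register(name):
    """Decorator that makes a function addressable by a DB-stored name."""
    def decorator(func):
        REGISTRY[name] = func
        return func
    return decorator


@register("add")
def add(x, y):
    return x + y


def run_bucket(rows, interval):
    """Run every registered function whose row belongs to this interval."""
    results = []
    for row in rows:
        if row["interval"] != interval:
            continue
        func = REGISTRY.get(row["name"])
        if func is None:
            continue  # unknown name: skip rather than crash the periodic task
        results.append(func(*row["args"]))
    return results


if __name__ == "__main__":
    # Stand-in for rows fetched from the database.
    rows = [{"name": "add", "interval": "5min", "args": (1, 2)}]
    print(run_bucket(rows, "5min"))
```

Looking functions up in an explicit registry, rather than importing arbitrary dotted paths from the DB, limits what a row can execute.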
With django-celery-beat, Celery beat stores all periodically scheduled tasks in the PeriodicTask model. A beat task can be scheduled in different ways, including crontab, interval, or solar; each of these schedule types is a foreign key on the PeriodicTask model.
To add a scheduled task dynamically, just populate the relevant django-celery-beat models and the scheduler will detect the change. Changes are detected when either the number of rows changes or save() is called.
from django_celery_beat.models import PeriodicTask, CrontabSchedule
# -- Inside the function you want to add task dynamically
schedule = CrontabSchedule.objects.create(minute='*/1')
task = PeriodicTask.objects.create(name='adder',
                                   task='apps.task.add', crontab=schedule)
task.save()

Function which executes when the date changes

I need a function to execute every time the date changes. Currently I'm checking in a loop to see whether the date has changed, but I'm looking for a more efficient method in Python.
Any help appreciated
What you really want to do is schedule a function to be run at a certain time. You need to do this with a scheduling mechanism. You could, of course, write one yourself, but probably the best way to go would be to use a library that does this for you.
APScheduler is a good, very mature library for just this sort of thing.
Docs: http://apscheduler.readthedocs.org/en/latest/
Pypi: https://pypi.python.org/pypi/APScheduler/3.0.0
Example
Here is a quick little example
from apscheduler.schedulers.blocking import BlockingScheduler
scheduler = BlockingScheduler()
@scheduler.scheduled_job('interval', seconds=5, timezone='UTC')
def hello():
    print('Hello!')
scheduler.start()
This will run the function hello every five seconds. You can change seconds=5 to days=1 to have it run once a day. There is much more configuration you can do, so you'll probably want to read the documentation. It can express just about any schedule you could want, including cron-style expressions.
It also supports different types of schedulers. For instance, I chose a BlockingScheduler because I wanted the entire program to run as a function of the scheduling mechanism (so you could try this out easily on your own system). You can also use, for instance, a BackgroundScheduler, which lets you schedule tasks from within your program without blocking the main thread (which fixes your looping-forever problem).
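For comparison, the "date changed" moment can also be computed directly with the standard library, so the program sleeps once instead of polling. This is a minimal sketch under stated assumptions: `seconds_until_next_day` is a hypothetical helper, and it uses naive local datetimes, so it ignores daylight-saving shifts.

```python
from datetime import datetime, timedelta


def seconds_until_next_day(now):
    """Seconds from now until the next local midnight (naive datetimes)."""
    tomorrow = now.date() + timedelta(days=1)
    next_midnight = datetime.combine(tomorrow, datetime.min.time())
    return (next_midnight - now).total_seconds()


if __name__ == "__main__":
    # A caller could time.sleep() for this long, run the function,
    # then compute the next delay again.
    print(seconds_until_next_day(datetime.now()))
```

Recomputing the delay after every run, rather than sleeping a fixed 24 hours, keeps the job aligned with the calendar even if one run takes a while.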
