Add Repeating Task With Redis - python

How do I schedule a task to run once every six hours (on repeat)?
I am trying to implement a Redis queue for the first time.
I went through Heroku's tutorial : https://devcenter.heroku.com/articles/python-rq
But the tutorial did not explain how to run a task repeatedly with a timeframe (such as checking a couple of websites for info, once every six hours)
Also, since I am new to do this, if I should not be using Redis for such a task, please let me know what I should be using to check a couple of websites for info once every six hours
Thanks

You don't need Redis for this functionality at all.
Take a look at the Heroku Scheduler here: https://devcenter.heroku.com/articles/scheduler
You can set this to run your code every hour, and have your code check if the current hour is 0,5,11,17 (or whatever other interval you may need).

Related

Django run tasks (possibly) in the far future

Suppose I have a model Event. I want to send a notification (email, push, whatever) to all invited users once the event has elapsed. Something along the lines of:
class Event(models.Model):
start = models.DateTimeField(...)
end = models.DateTimeField(...)
invited = models.ManyToManyField(model=User)
def onEventElapsed(self):
for user in self.invited:
my_notification_backend.sendMessage(target=user, message="Event has elapsed")
Now, of course, the crucial part is to invoke onEventElapsed whenever timezone.now() >= event.end.
Keep in mind, end could be months away from the current date.
I have thought about two basic ways of doing this:
Use a periodic cron job (say, every five minutes or so) which checks if any events have elapsed within the last five minutes and executes my method.
Use celery and schedule onEventElapsed using the eta parameter to be run in the future (within the models save method).
Considering option 1, a potential solution could be django-celery-beat. However, it seems a bit odd to run a task at a fixed interval for sending notifications. In addition I came up with a (potential) issue that would (probably) result in a not-so elegant solution:
Check every five minutes for events that have elapsed in the previous five minutes? seems shaky, maybe some events are missed (or others get their notifications send twice?). Potential workaroung: add a boolean field to the model that is set to True once notifications have been sent.
Then again, option 2 also has its problems:
Manually take care of the situation when an event start/end datetime is moved. When using celery, one would have to store the taskID (easy, ofc) and revoke the task once the dates have changed and issue a new task. But I have read, that celery has (design-specific) problems when dealing with tasks that are run in the future: Open Issue on github. I realize how this happens and why it is everything but trivial to solve.
Now, I have come across some libraries which could potentially solve my problem:
celery_longterm_scheduler (But does this mean I cannot use celery as I would have before, because of the differend Scheduler class? This also ties into the possible usage of django-celery-beat... Using any of the two frameworks, is it still possible to queue jobs (that are just a bit longer-running but not months away?)
django-apscheduler, uses apscheduler. However, I was unable to find any information on how it would handle tasks that are run in the far future.
Is there a fundemantal flaw with the way I am approaching this? Im glad for any inputs you might have.
Notice: I know this is likely to be somehwat opinion based, however, maybe there is a very basic thing that I have missed, regardless of what could be considered by some as ugly or elegant.
We're doing something like this in the company i work for, and the solution is quite simple.
Have a cron / celery beat that runs every hour to check if any notification needs to be sent.
Then send those notifications and mark them as done. This way, even if your notification time is years ahead, it will still be sent. Using ETA is NOT the way to go for a very long wait time, your cache / amqp might loose the data.
You can reduce your interval depending on your needs, but do make sure they dont overlap.
If one hour is too huge of a time difference, then what you can do is, run a scheduler every hour. Logic would be something like
run a task (lets call this scheduler task) hourly that gets all notifications that needs to be sent in the next hour (via celery beat) -
Schedule those notifications via apply_async(eta) - this will be the actual sending
Using that methodology would get you both of best worlds (eta and beat)

Django Crontab : How to stop parallel execution

I have few cronjobs running with the help of django-crontab. Let us take one cronjob as an example, suppose this job A is scheduled to run every two minutes.
However, while the job is running and if it is not finished in two minutes, I do not want another instance of this job to execute.
Exploring few resources, I came across this article, but I am not sure where to fit this in.
https://bencane.com/2015/09/22/preventing-duplicate-cron-job-executions/
Did someone already came across this issue? How did you fix it?
According to the readme, you should be able to set:
CRONTAB_LOCK_JOBS = True
in your Django settings. That will prevent a new job instance from starting if a previous one is still running.

Having a function run at random time intervals, web2py

Im currently making a program that would send random text messages at randomly generated times during the day. I first made my program in python and then realized that if I would like other people to sign up to receive messages, I would have to use some sort of online framework. (If anyone knowns a way to use my code in python without having to change it that would be amazing, but for now I have been trying to use web2py) I looked into scheduler but it does not seem to do what I have in mind. If anyone knows if there is a way to pass a time value into a function and have it run at that time, that would be great. Thanks!
Check out the Apscheduler module for cron-like scheduling of events in python - In their example it shows how to schedule some python code to run in a cron'ish way.
Still not sure about the random part though..
As for a web framework that may appeal to you (seeing you are familiar with Python already) you should really look into Django (or to keep things simple just use WSGI).
Best.
I think that actually you can use Scheduler and Tasks of web2py. I've never used it ;) but the documentation describes creation of a task to which you can pass parameters from your code - so something you need - and it should work fine for your needs:
scheduler.queue_task('mytask', start_time=myrandomtime)
So you need web2py's cron job, running every day and firing code similar to the above for each message to be sent (passing parameters you need, possibly message content and phone number, see examples in web2py book). This would be a daily creation of tasks which would be processed later by the scheduler.
You can also have a simpler solution, one daily cron job which prepares the queue of messages with random times for the next day and the second one which runs every, like, ten minutes, checks what awaits to be processed and sends messages. So, no Tasks. This way is a bit ugly though (consider a single processing which takes more then 10 minutes). You may also want to have and check some statuses of the messages to be processed (like pending, ongoing, done) to prevent a situation in which two jobs are working on the same message and to allow tracking progress of the processing. Anyway, you could use the cron method it in an early version of your software and later replace it by a better method :)
In any case, you should check expected number of messages to process and average processing time on your target platform - to make sure that the chosen method is quick enough for your needs.
This is an old question but in case someone is interested, the answer is APScheduler blocking scheduler with jobs set to run in regular intervals with some jitter
See: https://apscheduler.readthedocs.io/en/3.x/modules/triggers/interval.html

Uploading code to server and run automatically

I'm fairly competent with Python but I've never 'uploaded code' to a server before and have it run automatically.
I'm working on a project that would require some code to be running 24/7. At certain points of the day, if a criteria is met, a process is started. For example: a database may contain records of what time each user wants to receive a daily newsletter (for some subjective reason) - the code would at the right time of day send the newsletter to the correct person. But of course, all of this is running out on a Cloud server.
Any help would be appreciated - even correcting my entire formulation of the problem! If you know how to do this in any other language - please reply with your solutions!
Thanks!
Here are two approaches to this problem, both of which require shell access to the cloud server.
Write the program to handle the scheduling itself. For example, sleep and wake up every few miliseconds to perform the necessary checks. You would then transfer this file to the server using a tool like scp, login, and start it in the background using something like python myscript.py &.
Write the program to do a single run only, and use the scheduling tool cron to start it up every minute of the day.
Took a few days but I finally got a way to work this out. The most practical way to get this working is to use a VPS that runs the script. The confusing part of my code was that each user would activate the script at a different time for themselves. To do this, say at midnight, the VPS runs the python script (using scheduled tasking or something similar) and runs the script. the script would then pull times from a database and process the code at those times outlined.
Thanks for your time anyways!

Get certain celery tasks to only run during certain hours

Using Celery, with (if it matters) a rabbitmq broker and django as the result backend.
I have many thousands of tasks, such that it will take weeks to complete them all. However, some of them I only want to be run between certain hours of the day. This is because they involve downloading files from a public server that has explicitly requested this behaviour. Each task takes several minutes to complete. How best to go about this?
You can use Crontab schedules to setup what you need.

Categories

Resources