Google App Engine Python Cron Job - python

I wanted to run my cron job as 'schedule: every saturday every 2 minutes from 01:00 to 3:00', and it won't allow this format. Is it possible to set a cron job to target another cron job? Or is my schedule possible just not in the correct format?

Unfortunately, you cannot combine the weekday option with the interval.
You could add a switch in the request handler of your cron job that just exits if the current weekday is not Saturday, while the cron job itself is scheduled "every 2 minutes from 01:00 to 03:00". But that means your handler will be called about 360 times per week (60 runs on each of the other six days) to do nothing, and will only do real work the other 60 times.
Alternatively, you could use an "every saturday 01:00" cron job (as dispatcher) that creates 60 push tasks (as workers) with a countdown or ETA, spread between 01:00 and 03:00. However, I don't think the exact execution time is guaranteed.
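Both options can be sketched with plain datetime arithmetic. This is a minimal sketch, not App Engine code: the function names `should_run` and `spread_countdowns` are made up for illustration, and it assumes the handler can obtain the current time in the same timezone as the cron schedule.

```python
from datetime import datetime, timedelta

SATURDAY = 5  # datetime.weekday(): Monday is 0, so Saturday is 5


def should_run(now):
    """Option 1: guard at the top of a handler that cron calls every day."""
    return now.weekday() == SATURDAY


def spread_countdowns(window=timedelta(hours=2), step=timedelta(minutes=2)):
    """Option 2: one countdown (in seconds) per worker task.

    Countdowns are measured from the moment the dispatcher runs (01:00).
    """
    count = int(window / step)  # 2 hours / 2 minutes = 60 tasks
    return [int(i * step.total_seconds()) for i in range(count)]


countdowns = spread_countdowns()
```

In the dispatcher, each value would be handed to the task queue as the task's countdown, for example `taskqueue.add(url=..., countdown=seconds)` in the Python 2.7 runtime; treat that call as an assumption and check it against the runtime you use.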

Related

airflow schedule DAG with unusual times

I'm trying to figure out the best way to schedule a DAG in airflow that doesn't conform to the ways that they are typically scheduled.
The times I want the DAG to run are between 9:40 AM and 4:00 PM, Monday-Friday and to run every ten minutes.
1) cron could sort of work here as I could set up multiple DAGs that execute the same code and give them different cron triggers. For instance, trigger the first to run at 9:40 and run once, run the second at 9:50 (also run once) and then the third run from 10 AM to 4, Mon-Fri every ten minutes.
2) The airflow preset (e.g. @hourly) or timer interval wouldn't really work here either, since as far as I can tell there is no way to set up a timer interval with the weird start time (9:40 AM) and the Mon-Fri restriction. But at least here I can set the timedelta to 10 minutes.
3) The other option would be to set the schedule to None and have a second script externally trigger the DAG, using the subprocess module.
In my ideal scenario, I could write a generator which would give python datetimes that I want the dag to be triggered and give that to the DAG object. I guess I could combine that solution with 3 above.
Solution 1 could work, but seems hacky.
Wanted to know what other folks have done in this situation.
*/10 9-16 * * 1-5
This cron expression gives you a run every 10 minutes from 9:00 through 16:50 (the hour range 9-16 includes the whole 16:00 hour), Mondays to Fridays only (1-5).
I don't know how you can get the finer granularity to have 9:40am to 4pm.
*/10 in the minute field indicates a run every 10 minutes
9-16 in the hour field indicates runs only between hour 9 and hour 16
1-5 in the day-of-week field (the fifth field) indicates runs as per the following table:
0 - Sun Sunday
1 - Mon Monday
2 - Tue Tuesday
3 - Wed Wednesday
4 - Thu Thursday
5 - Fri Friday
6 - Sat Saturday
7 - Sun Sunday
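The generator the question asks for (Python datetimes from 9:40 AM to 4:00 PM, every ten minutes, Monday to Friday) can be sketched with the standard library alone. The name `trigger_times` and its parameters are hypothetical, and how the resulting datetimes are fed to Airflow (for example by an external trigger script, as in option 3) is left open.

```python
from datetime import date, datetime, time, timedelta


def trigger_times(first_day, days=7, first=time(9, 40), last=time(16, 0),
                  step=timedelta(minutes=10)):
    """Yield datetimes every `step` from `first` to `last` on Mon-Fri."""
    for offset in range(days):
        day = first_day + timedelta(days=offset)
        if day.weekday() >= 5:  # Python numbering: 5 = Saturday, 6 = Sunday
            continue
        current = datetime.combine(day, first)
        end = datetime.combine(day, last)
        while current <= end:  # the 16:00 run itself is included
            yield current
            current += step


# 2024-01-01 is a Monday, so this covers one full working week.
week = list(trigger_times(date(2024, 1, 1), days=7))
```

Note that Python's `weekday()` counts Monday as 0, unlike the cron table above. If cron is preferred, the same window can be split into three standard expressions, `40,50 9 * * 1-5`, `*/10 10-15 * * 1-5` and `0 16 * * 1-5`, which is essentially the multi-DAG approach from option 1.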

Scheduling a method to run at specific times

I'm making an app in python to send texts via Twilio. I'm using flask and it's hosted on Google App Engine. I have a list of messages that need to be sent at a specific date and time, by calling my message function. What's a simple way to go about creating this? I'm relatively new to all this.
I tried apscheduler, but it only worked on my local and not on the app engine. I've read about cron jobs, but can't find anything about specific dates/times or how to pass args when the job runs.
As mentioned in the comments by Fabio, you could make a cron task that runs every 10 minutes (or every minute). I would have it look in a folder for messages to send. If you name the files in that folder so that they start with the date and time, you could do something like this:
folder content:
201707092205_<#message_id>
pseudo-code for sending the message:
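import os
from datetime import datetime

# Timestamp for the current minute, matching the file-name prefix (e.g. 201707092205)
instant_when_the_script_is_run = datetime.now().strftime("%Y%m%d%H%M")
for name in os.listdir(folder):
    if name.startswith(instant_when_the_script_is_run):
        path = os.path.join(folder, name)
        with open(path, "r") as fh:
            destination = fh.readline().strip()  # the first line is the destination
            message = fh.read()  # the rest of the file is the message
        twilioapi.sendmessage(destination, message)  # placeholder for your Twilio call
        os.remove(path)  # the removal could be done in another script to leave some traces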
This is where Google App Engine comes in handy. You can use App Engine cron jobs: create a cron.yaml file in your project. In this file you can define all kinds of schedules, from every day to one day a week at a particular time. The following is an example cron.yaml file:
cron:
- description: "daily summary job"
  url: /tasks/summary
  schedule: every 24 hours
- description: "monday morning mailout"
  url: /mail/weekly
  schedule: every monday 09:00
  timezone: Australia/NSW
- description: "new daily summary job"
  url: /tasks/summary
  schedule: every 24 hours
  target: beta
Cron schedules are specified using a simple English-like format.
every 12 hours
every 5 minutes from 10:00 to 14:00
every day 00:00
every monday 09:00
2nd,third mon,wed,thu of march 17:00
1st monday of sep,oct,nov 17:00
1 of jan,april,july,oct 00:00
For a fuller explanation of the schedule format, please refer to the App Engine cron documentation.
Another option is to use a taskqueue task where you specify the time that the task should be run using the eta option (estimated time of arrival).
The task will sit in the queue until its time of execution arrives, and then GAE will cause the task to be launched to do whatever processing you need such as sending a text message.
The tasks may not be executed at the precise time you specify but in my experience it is generally quite close. Certainly far more accurate than running a CRON job every 10 minutes.
This will also be much more efficient than using a CRON job because a CRON job will cause a request to your app every 10 minutes but the task will only execute when needed. If you have a low volume app this may help you stay within the free quota.
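A minimal sketch of the ETA approach follows. To keep it runnable outside App Engine, the enqueue function is passed in; on the Python 2.7 runtime it would presumably be `google.appengine.api.taskqueue.add`, which accepts `url`, `params` and `eta` keyword arguments. The URL `/tasks/send_sms`, the function name `schedule_messages` and the sample phone number are assumptions for illustration only.

```python
from datetime import datetime


def schedule_messages(messages, enqueue):
    """Enqueue one task per message, using its send time as the task ETA.

    messages: iterable of (send_at, destination, body) tuples
    enqueue:  callable taking url, params and eta keyword arguments
    """
    for send_at, destination, body in messages:
        enqueue(
            url="/tasks/send_sms",  # hypothetical worker endpoint
            params={"to": destination, "body": body},
            eta=send_at,
        )


# Stand-in for the real task queue so the sketch runs anywhere.
calls = []


def fake_enqueue(**kwargs):
    calls.append(kwargs)


schedule_messages(
    [(datetime(2017, 7, 9, 22, 5), "+15550100", "Hello")],
    fake_enqueue,
)
```

The handler behind the worker URL would then read `to` and `body` from the request and call the message function, which also covers the question of how to pass arguments to the job.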

What does the landing time mean in airflow?

There is a section called "landing time" in the DAG view on the web console of airflow.
An example screen shot taken from airbnb's blog:
But what does it mean? There is no definition in the documents or in their repository.
Since the existing answer here wasn't totally clear, and this is the top hit for "airflow landing time" I went to the chat archives and found the original answer being referenced here:
Maxime Beauchemin @mistercrunch Jun 09 2016 11:12
it's the number of hours after the time the scheduling period ended
take a schedule_interval='@daily' run for 2016-01-01 that finishes at 2016-01-02 03:52:00
landing time is 3:52
https://gitter.im/apache/incubator-airflow/archives/2016/06/09
It seems the Y axis is in hours, and the negative landing times are a result of running jobs manually so they finish hours before they "should have finished" based on the schedule.
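The quoted definition can be checked with plain datetime arithmetic. This is only a sketch of the calculation as described in the chat message, not Airflow's own code; the function name `landing_time_hours` is made up, and the second (negative) example uses an invented finish time for a manually triggered run.

```python
from datetime import datetime


def landing_time_hours(end_time, period_end):
    """Hours between the end of the scheduling period and the task's end."""
    return (end_time - period_end).total_seconds() / 3600


# The @daily run for 2016-01-01 covers a period that ends at 2016-01-02 00:00.
period_end = datetime(2016, 1, 2, 0, 0)

# Scheduled run that finished at 03:52 the next day: 3 hours 52 minutes.
scheduled = landing_time_hours(datetime(2016, 1, 2, 3, 52), period_end)

# Manual run that finished before the period ended: negative landing time.
manual = landing_time_hours(datetime(2016, 1, 1, 22, 0), period_end)
```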
I directly asked the author Maxime. His answer was landing_time is when the job completes minus when the job should have started (for airflow, it's the end of the scheduled period).
source:
http://gitter.im/apache/incubator-airflow
It is a good place to get help, and Maxime is very nice and helpful. But the answers are not persistent.
For me it's easier to understand landing_time using an example.
So let's say we have a dag scheduled to run daily at 0 0 * * *. This dag has 2 tasks that execute sequentially:
first_task >> second_task
The first_task starts at 00:00:10 and finishes 5 minutes later, at 00:05:10.
The landing_time for first_task will be 5 minutes and 10 seconds.
The second_task starts at 00:07:00 and finishes 2 minutes later, at 00:09:00. The landing_time for second_task would be 9 minutes.
So we just subtract the DAG's execution_date from the task's end_time.
Thanks to @Kalinde Pride for commenting and pointing me to the only source of truth, the airflow code base.
I usually use landing_time as a metric for the performance of the whole airflow system. For example, an increase in the landing_times of the first tasks suggests that the scheduler is under heavy load or that we should adjust task parallelization (through airflow.cfg).
Landing Times: Total time spent including retries.

Multiple Time Zones in Google Appengine Cron Job

I want to schedule a task for 9:00 AM in every country. (basically 9:00 AM in every time zone). How can I schedule that in google appengine?
Does the timezone parameter accept multiple time zones?
Thanks in advance
You can schedule a cron job to run every hour, because every hour there is 9 am somewhere.
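A handler for such an hourly job could work out which regions are at 9:00 local time right now. The sketch below uses fixed UTC offsets to stay self-contained; the region table and the name `regions_at_hour` are made up, and real code would need proper timezone data (for example the standard `zoneinfo` module) to handle daylight saving time and offsets that are not whole hours.

```python
from datetime import datetime, timedelta

# Hypothetical sample of regions with fixed UTC offsets (DST is ignored).
REGION_OFFSETS = {
    "London": timedelta(hours=0),
    "Berlin": timedelta(hours=1),
    "New York": timedelta(hours=-5),
    "Tokyo": timedelta(hours=9),
}


def regions_at_hour(now_utc, target_hour=9, offsets=REGION_OFFSETS):
    """Return the regions whose local hour equals target_hour at now_utc."""
    return sorted(
        name
        for name, offset in offsets.items()
        if (now_utc + offset).hour == target_hour
    )


# At 00:00 UTC it is 09:00 in the UTC+9 region.
midnight_utc = regions_at_hour(datetime(2024, 1, 1, 0, 0))
```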

How to set Google App Engine cron job using different interval in different period of time?

How do I configure a cron job to run every 5 minutes between 9:00 and 20:00,
but every 10 minutes during the rest of the day?
I would recommend just using every 5 minutes synchronized in the cron.yaml, and then just terminate immediately in the handler if the exact time is not to your liking (hour before 9 or after 20 and minute // 5 is odd, for example). GAE's cron is not very sophisticated, but running a trivial handler which just gets the time, checks whether that's OK, and terminates immediately otherwise, is pretty simple and cheap (and the 70 or so "extra hits per day", each with a trivial amount of resource consumption, will hardly make a difference to your app's overall resource consumption anyway).
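The check described above might look like the following sketch. The function name `should_run` is hypothetical, and it assumes the handler reads the current time in the same timezone as the schedule and that cron fires on minutes 0, 5, 10 and so on (the "synchronized" option).

```python
from datetime import datetime


def should_run(now):
    """Decide whether this 5-minute cron tick should do real work."""
    if 9 <= now.hour < 20:
        return True  # daytime: every 5-minute tick counts
    # Outside 09:00-20:00: keep only ticks on a 10-minute boundary.
    return now.minute % 10 == 0


# In the handler: if not should_run(datetime.now()), return immediately.
```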
The new cron API can now do it. Please check the documentation at: https://cloud.google.com/appengine/docs/python/config/cron#Python_app_yaml_The_schedule_format
every 12 hours
every 5 minutes from 10:00 to 14:00
every day 00:00
every monday 09:00
2nd,third mon,wed,thu of march 17:00
1st monday of sep,oct,nov 17:00
1 of jan,april,july,oct 00:00
