I am currently working on a program that needs to run every 14 days. I have looked into Schedule which works fine, but I have a few doubts about how to go about this.
I will create a service which will handle the execution of the python program itself on a CentOS 7 system.
The issue here is that every 14 days I will run a function that generates a lot of email addresses and sends them to a support entity. I am afraid that if something unintended happens and the program restarts, the support entity will get spammed with emails outside the time frame in which they should receive them.
As far as I can tell, Schedule does not have any way of determining if the program has restarted, and therefore a reboot of either the system or the service will cause this behaviour.
Would it be a correct solution to write a date to a text file after each completed function run, and then check that text file once a day to determine whether the function should run or not? This method would survive a service and/or system reboot, but is it a "correct" way of doing it?
UPDATE: Having the cron job run on specific days of the month (for example the 1st and 15th) is not sufficient. This could cause gaps in the data which the program processes. The script makes a call which pulls data from 14 days back, and this is the maximum number of days supported by the script (licensing and stuff, can't be changed, so not that important except that it is a limitation). So it needs to run on, let's say, odd or even week numbers (to get 14 days).
Any ideas on how to accomplish this given this new information?
You should look into the use of cron (or google it yourself if you don't like the link).
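I suggest creating a simple Python script that is called by cron roughly every two weeks. Cron has no native "every 14 days" interval, so the entries below are approximations based on the day of the month. The crontab entry could look like the following:
# this will run at 00:01 on the 1st, 16th and 31st of every month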
1 0 */15 * * /path/to/python/script.py
# this will run at 00:01 on the 1st and 15th of every month
1 0 1,15 * * /path/to/python/script.py
You still could make your script write some sort of result (with maybe a timestamp) to a file, so that you could easily check that file to see if it ran correctly (or log some error info).
# this will run at 00:01 on the 1st and 15th of every month
1 0 1,15 * * /path/to/python/script.py >> /path/to/logfile.log 2>&1
EDIT
You can also configure cron to run every Monday (or another day) if the 1st and 15th of every month are not sufficient. The script could then check a log file to see whether it ran the previous Monday, to ensure it only executes your business logic every 2 weeks.
# this will run at 00:01 once a week on Mondays
1 0 * * 1 /path/to/python/script.py >> /path/to/logfile.log 2>&1
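The weekly cron entry combined with a "did I already run?" check could be sketched as below. This is only a minimal sketch under assumptions: the state file path `last_run.txt` and the `job` callable (standing in for the function that sends the emails) are hypothetical names, not part of the question.

```python
import datetime as dt
import os

STATE_FILE = "last_run.txt"  # hypothetical path; use a location the job can write to
INTERVAL_DAYS = 14


def read_last_run(path=STATE_FILE):
    """Return the date of the last completed run, or None if none is recorded."""
    if not os.path.exists(path):
        return None
    with open(path) as fh:
        text = fh.read().strip()
    if not text:
        return None
    return dt.datetime.strptime(text, "%Y-%m-%d").date()


def is_due(last_run, today, interval_days=INTERVAL_DAYS):
    """True if nothing has run yet or at least interval_days have passed."""
    return last_run is None or (today - last_run).days >= interval_days


def record_run(today, path=STATE_FILE):
    """Store the run date as YYYY-MM-DD."""
    with open(path, "w") as fh:
        fh.write(today.isoformat())


def main(job, today=None, path=STATE_FILE):
    """Run job() only when it is due; return True if it ran."""
    today = today or dt.date.today()
    if not is_due(read_last_run(path), today):
        return False
    job()                    # hypothetical business logic, e.g. sending the emails
    record_run(today, path)  # written only after job() returned without raising
    return True
```

Because the date is written only after the job returns, a restart after a completed run does nothing, while a crash in the middle of a run leaves the old date in place so the next invocation tries again.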
Related
How can I write a code that will run a task for example "everyday (or every 24 hours)at 3:20 a.m."?
The main problem is "3:20" part, how do I make cron task at this exact time?
You simply specify the minute and hour that you want it to occur on, and use * for the other values. For example: 20 3 * * * will run at 3:20 AM every day forever.
You can experiment with cron-schedules using this website to get a better understanding for how the syntax works: https://crontab.guru/#20_3_*_*_*
I am writing a python script to transfer large files via sftp with the pysftp module. I have a massive amount of data to transfer, a total of around 36Tb, divided in 54 runs, or batches.
I want only to carry out these transfers between certain hours of the day, for this example, between 6pm and 7am. So my idea is to use a for loop to iterate over all the runs/ batches. Upon each iteration, I would check what hour it is. If it is between 6pm and 7am I would transfer. Else the script would sleep until it is 6pm minimum. The code that I wrote looks like so:
import datetime as dt
import time

runsList = 'runA runB runC'.split()  # these are directories
# time constraints
bottomLimit = 7
upperLimit = 18
doNotUploadRange = range(bottomLimit, upperLimit)  # hours 7..17: no uploads
for run in runsList:
    hour = dt.datetime.now().hour
    while hour in doNotUploadRange:
        print('do not upload now')
        time.sleep(1800)  # re-check every 30 minutes
        hour = dt.datetime.now().hour
    # when I leave the while condition above
    # do the transfer via pysftp (large amount of data) per run
The question here does not concern the code itself, nor do I want to check whether or not the script is running (which can be checked with htop). Rather, I am concerned that my script will crash, for whatever reason, before it finishes (perhaps it would be running for a full week if nothing crashes).
I do sometimes call scripts that run for a very long time and they do crash sometimes, with no obvious reason for the crash.
So my question is whether there is any reason to expect the script to crash after running for 6-7 days, or can I expect it to finish provided that there is no error in the code itself? My idea is to run this script in the background inside tmux: python script.py &
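As a side note on the "sleep until 6pm" loop described above, the wait can be computed directly instead of polling every 30 minutes. The following is a minimal sketch, assuming the same 18:00 to 07:00 window; the function names `in_upload_window` and `seconds_until_window` are made up for illustration.

```python
import datetime as dt


def in_upload_window(now, start_hour=18, end_hour=7):
    """True if now falls inside the window, which may wrap past midnight."""
    if start_hour <= end_hour:
        return start_hour <= now.hour < end_hour
    return now.hour >= start_hour or now.hour < end_hour


def seconds_until_window(now, start_hour=18, end_hour=7):
    """Seconds to sleep until the window opens; 0.0 if already inside it."""
    if in_upload_window(now, start_hour, end_hour):
        return 0.0
    opening = now.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    if opening <= now:
        opening += dt.timedelta(days=1)  # today's opening already passed
    return (opening - now).total_seconds()
```

Inside the loop over the runs, `time.sleep(seconds_until_window(dt.datetime.now()))` would then replace the `while` block. This only covers the scheduling part; it says nothing about whether a week-long process will survive.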
I have a python script that is triggered by CRON every 5 mins. This python script then does some checks and then calls another python script many times with different configurations. Those other python scripts are trading bots and they need to be called separately for each trading pair so I end up calling this second script about 50 times (each time with a different config).
The problem I have is that between my first python script being started by CRON and then the subsequent python scripts being called, about 5 to 8 seconds pass (I am using subprocess.Popen to open the secondary scripts) and this 8 seconds makes a huge difference in trading.
So what I would like to know, is if there is a way to offset my system clock by say, 5 seconds so that the secondary scripts are called as close as possible to the actual start of the minute?
I'm making an app in python to send texts via Twilio. I'm using flask and it's hosted on Google App Engine. I have a list of messages that need to be sent at a specific date and time, by calling my message function. What's a simple way to go about creating this? I'm relatively new to all this.
I tried apscheduler, but it only worked on my local and not on the app engine. I've read about cron jobs, but can't find anything about specific dates/times or how to pass args when the job runs.
As mentioned in the comments by Fabio, you could make a cron task that runs every 10 minutes (or every minute). It would look into a folder for messages to send. If the filenames in that folder start with the date and time, you could do something like this:
folder content:
201707092205_<#message_id>
pseudo-code for sending the message:
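import os
from datetime import datetime

folder = "messages"  # hypothetical folder holding the queued message files
instant_when_the_script_is_run = datetime.now().strftime("%Y%m%d%H%M")  # e.g. 201707092205
for name in os.listdir(folder):
    if name.startswith(instant_when_the_script_is_run):
        path = os.path.join(folder, name)
        with open(path) as fh:
            destination = fh.readline().strip()  # reading the first line
            message = fh.read()  # reading the rest of the message
        twilioapi.sendmessage(destination, message)  # placeholder for the real Twilio call
        os.remove(path)  # the remove could be done in another script to leave some traces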
This is where Google App Engine comes in handy. You can use cron jobs from App Engine. Create a cron.yaml file in your project. In this file you can define all kinds of schedules, from every day to one day a week at a particular time. The following is an example cron.yaml file:
cron:
- description: "daily summary job"
  url: /tasks/summary
  schedule: every 24 hours
- description: "monday morning mailout"
  url: /mail/weekly
  schedule: every monday 09:00
  timezone: Australia/NSW
- description: "new daily summary job"
  url: /tasks/summary
  schedule: every 24 hours
  target: beta
Cron schedules are specified using a simple English-like format.
every 12 hours
every 5 minutes from 10:00 to 14:00
every day 00:00
every monday 09:00
2nd,third mon,wed,thu of march 17:00
1st monday of sep,oct,nov 17:00
1 of jan,april,july,oct 00:00
For a fuller explanation of the schedule format, please refer to this documentation.
Another option is to use a taskqueue task where you specify the time that the task should be run using the eta option (estimated time of arrival).
The task will sit in the queue until its time of execution arrives, and then GAE will cause the task to be launched to do whatever processing you need such as sending a text message.
The tasks may not be executed at the precise time you specify but in my experience it is generally quite close. Certainly far more accurate than running a CRON job every 10 minutes.
This will also be much more efficient than using a CRON job because a CRON job will cause a request to your app every 10 minutes but the task will only execute when needed. If you have a low volume app this may help you stay within the free quota.
I have a python script monitor.py that analyzes a text log. The log is generated by the 'logging' module inside each python script run as a Scheduled Job, and all results get logged to a single file, e.g. C:\log.txt. I scheduled the monitor.py script to run every hour. I don't want c:\log.txt to keep growing, so I think it would be a good idea to delete it sometime after midnight. Note: I don't have other scheduled jobs at night, so this will have no impact.
I want to check the current time: if it is between 12:00 AM and 1:00 AM, i.e. between midnight and 1 AM, I will delete C:\log.txt and immediately generate a new c:\log.txt file. I noticed that a Scheduled Job on Windows does not start at exactly the scheduled time but a few seconds before, hence my prototype would be:
1. check if current time is between 23:59 PM and 1:00 AM
2. in case 1. is 'true' -> delete c:\log.txt and create a new c:\log.txt
My only problem is that I don't know how to write a condition like:
1:00 AM < current time > 23:59 PM
Could someone help me on it?
Thanks
I don't want this c:\log.txt get growing
You can use https://docs.python.org/2.7/library/logging.handlers.html#rotatingfilehandler or https://docs.python.org/2.7/library/logging.handlers.html#timedrotatingfilehandler to limit a logfile size.
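As a rough illustration of the second link, a logger that rolls its file over at midnight might be set up as below. This is a sketch under assumptions: the helper name `build_logger`, the logger name `monitor`, the format string and the retention of 7 old files are all made up for the example.

```python
import logging
import logging.handlers


def build_logger(path, name="monitor"):
    """Return a logger whose file is rotated once per day at midnight."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.TimedRotatingFileHandler(
        path,
        when="midnight",  # roll over once per day
        backupCount=7,    # assumed retention: keep 7 old files, delete older ones
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Usage would be along the lines of `log = build_logger(r"C:\log.txt")` followed by `log.info("...")`. With several scripts writing to the same file, rotation may not behave cleanly, so it is worth testing in the actual setup.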