I'm more or less a Python beginner, so I might be using the wrong tools for the job. Or maybe I'm using the correct tools but not correctly.
Here's what I'm trying to do.
def stand_by(self, evt_time, number):
    """Waiting for scheduled time

    Starts a scheduler in a separate thread that will run in the background
    until a specific time is reached, usually a time in a flight's
    schedule. Once that specific time is reached, the program will regularly
    check if the state of the flight has changed as expected.

    Args:
        evt_time (:obj:`datetime`): The scheduled time of the status change.
            Schedulers need Unix timestamps, so this has to be converted
            inside the function.
        number (str): Flight number of the flight that should be observed.
            This function itself does not need it but it has to be passed on
            as an argument.
    """
    # evt_time is given in UTC time but needs to be checked against
    # system time
    scheduler = sched.scheduler(time.time, time.sleep)
    tz = pytz.timezone(player.tz)
    # Turns the event time from datetime UTC into datetime player actual
    evt_time = pytz.utc.localize(evt_time).astimezone(tz)
    # Turns the datetime object into a UNIX time stamp
    evt_time = evt_time.timestamp()
    threading.Thread(target=lambda: scheduler.enterabs(
        evt_time, 2, check_dep, [number])).start()
    scheduler.run()
The user selects a flight they want to "board". Let's say it's 1 PM and the flight is scheduled to leave at 2 PM. The program will start a scheduler that later starts a "listener" at 2 PM to see if the flight has departed. To enable the user to continue using the program until 2 PM, the scheduler is running in a separate thread. I'm trying to figure out how I can terminate that thread and/or scheduler if the user decides to choose a different "flight" before 2 PM. I can't seem to find an elegant way of terminating threads.
I have found threading.Timer, which can be cancelled, but that takes a time interval, not a time object...
I'm not married at all to this function. What I need is a way to run another function at a fixed time so that the program remains usable and that can be terminated if the user changes their mind. Thanks a lot for ideas!
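For what it's worth, here is a minimal sketch of one way this could be done with threading.Timer, by turning the absolute time into an interval at the moment the timer is created. The helper name run_at is made up for illustration, and it assumes evt_time is a naive datetime in UTC, as in the function above. Note that in the function above only enterabs runs in the new thread, while scheduler.run() blocks the caller; a Timer avoids that because it waits in its own thread.

```python
import threading
import time
from datetime import datetime, timedelta, timezone


def run_at(when_utc, func, *args):
    """Call func(*args) at when_utc and return a cancellable Timer.

    Hypothetical helper: when_utc is assumed to be a naive datetime
    expressed in UTC. A Unix timestamp does not depend on the time zone,
    so no conversion to the player's zone is needed for the delay.
    """
    target = when_utc.replace(tzinfo=timezone.utc).timestamp()
    # A time in the past gives a zero delay, so the call fires right away.
    delay = max(0.0, target - time.time())
    timer = threading.Timer(delay, func, args=args)
    timer.daemon = True  # do not keep the program alive just for the timer
    timer.start()
    return timer
```

The returned object can be kept around (for example on the player) and cancelled with timer.cancel() when the user picks another flight; a cancelled timer never calls the function.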
I would like to schedule a series of absolutely-timed events that will be invoked after an unknown delay. This means that some events might be in the past at the moment we run the scheduler. However, in my application expired events at the start of the run need to be discarded.
Is it possible in Python's sched.py library to instruct the scheduler to discard events in the past at the moment we run the scheduler?
For example, when running a simple sequence of events like this:
import sched
import time
s = sched.scheduler(timefunc=time.time)
now = time.time()
s.enterabs(time=now - 5, action=print, argument=(1,), priority=1)
s.enterabs(time=now + 2, action=print, argument=(2,), priority=1)
s.enterabs(time=now + 4, action=print, argument=(3,), priority=1)
s.run()
I would like to see something like:
2
3
However, the output is:
1
2
3
as the scheduler immediately catches up with past events. Can I somehow override this behaviour? Or is there another library that might better respond to this requirement?
Thank you in advance
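As far as I know sched has no switch for this, but a possible workaround is to cancel the expired events yourself just before calling run(). The sketch below uses only the documented queue attribute and cancel() method; the function name run_discarding_expired is made up for illustration, and it assumes the event times use the same clock as the value passed in as now.

```python
import sched
import time


def run_discarding_expired(scheduler, now=None):
    """Cancel every event scheduled before `now`, then run the rest.

    Hypothetical helper: `now` defaults to time.time(), which assumes the
    scheduler was created with timefunc=time.time.
    """
    if now is None:
        now = time.time()
    # scheduler.queue returns a fresh list, so cancelling while iterating
    # over it is safe.
    for event in scheduler.queue:
        if event.time < now:
            scheduler.cancel(event)
    scheduler.run()
```

With the three events from the question, calling run_discarding_expired(s) instead of s.run() should drop the event scheduled at now-5 and run the other two.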
I have a Python script in which a certain job needs to be done at, say, 8 AM every day. My plan was to have a while loop keep the program running all the time and, inside the loop, use a scheduler-type package to specify the time at which a specific subroutine should start. That way, other routines that run at different times of the day could be scheduled too.
import schedule

def job(t):
    print("I'm working...", t)

schedule.every().day.at("08:00").do(job, 'It is 08:00')
Then I would let the Windows scheduler run this program, and done. But I was wondering whether this is terribly inefficient, since the while loop wastes CPU cycles and could freeze the computer as the program grows in future. Could you please advise whether there is a more efficient way to schedule tasks that need to be executed down to the second, without having to run a while loop?
I noted that you have a hard time requirement for executing your script. Just set your Windows Scheduler to start the script a few minutes before 8am. Once the script starts it will start running your schedule code. When your task is done exit the script. This entire process will start again the next day.
And here is the correct way to use the Python module schedule:
from time import sleep
import schedule

def schedule_actions():
    # Every day job() is called at 08:00
    schedule.every().day.at('08:00').do(job, variable="It is 08:00")
    # Checks whether a scheduled task is pending to run or not
    while True:
        schedule.run_pending()
        # set the sleep time to fit your needs
        sleep(1)

def job(variable):
    print(f"I'm working...{variable}")
    return

schedule_actions()
Here are other answers of mine on this topic:
How schedule a job run at 8 PM CET using schedule package in python
How can I run task every 10 minutes on the 5s, using BlockingScheduler?
Execute logic every X minutes (without cron)?
Why a while loop? Why not just let the Windows Scheduler, or a cron job on Linux, run your simple Python script to do whatever it needs to do and then stop?
Maintenance tends to become a big problem over time, so try to keep things as lightweight as possible.
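If the script does have to stay alive until the job time, one option that avoids waking up every second is to compute the delay once and sleep for exactly that long. This is only a sketch: the function name seconds_until is made up, it works on naive local time, and it ignores daylight-saving transitions.

```python
from datetime import datetime, timedelta


def seconds_until(hour, minute, now=None):
    """Return the seconds from `now` until the next hour:minute.

    Hypothetical helper working on naive local datetimes; if the target
    time has already passed today, tomorrow's occurrence is used.
    """
    if now is None:
        now = datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)
    return (target - now).total_seconds()
```

A script started by the Windows Scheduler shortly before 8 AM could then call time.sleep(seconds_until(8, 0)), run the job once, and exit.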
I'm trying to implement an SLA in my Airflow DAG.
I know how SLAs work: you set a timedelta object, and if the task is not done within that duration, Airflow sends an email notifying you that the task is not done yet.
I want similar functionality, but instead of giving a duration I want to set a specific time in the SLA. For example, if the task is not done by 8:00 AM, it sends the email and notifies the manager. Something like this:
'sla': time(hour=8, minute=0, second=0)
I have searched a lot but found nothing.
Is there any solution for this specific problem, or any alternative to SLAs?
Thanks in advance.
The sla parameter of BaseOperator expects a datetime.timedelta object, so there is nothing more to do there. Keep in mind that the SLA represents a time delta after the scheduled period is over. The example from the docs assumes a DAG scheduled daily:
For example if you set an SLA of 1 hour, the scheduler would send an email soon after 1:00AM on the 2016-01-02 if the 2016-01-01 instance has not succeeded yet.
The point is that it's always a time delta from the schedule period, which is not what you are looking for.
So I think you should take another approach: schedule your DAG whenever you need it, execute the tasks you want, and then add a sensor operator to check whether the condition you are looking for is met. There are several types of sensors to choose from, depending on your context.
Another option is to create a new DAG dedicated to checking whether the tasks in the original DAG executed successfully, and to act accordingly (for example, by sending emails). To do this you could use an ExternalTaskSensor; there are tutorials online on how to implement it, although it may be simpler to avoid cross-DAG dependencies, as stated in the docs.
Hope this points you in the right direction.
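As a side note, if the DAG runs on a fixed daily schedule, a wall-clock deadline can in principle be rewritten as a delta from the end of the schedule period, under the semantics quoted above. The sketch below is plain datetime arithmetic, not an Airflow API: the function name sla_for_deadline is made up, and it assumes the period always ends at the same time of day.

```python
from datetime import date, datetime, time, timedelta


def sla_for_deadline(period_end, deadline):
    """Return the timedelta from `period_end` to the next `deadline`.

    Hypothetical helper: both arguments are datetime.time values in the
    same time zone. A deadline at or before the period end is taken to
    mean the following day.
    """
    anchor = date(2000, 1, 1)  # arbitrary date, only used for subtraction
    end = datetime.combine(anchor, period_end)
    due = datetime.combine(anchor, deadline)
    if due <= end:
        due += timedelta(days=1)
    return due - end
```

For a daily DAG whose period ends at midnight, sla_for_deadline(time(0, 0), time(8, 0)) gives timedelta(hours=8), which could be passed as the sla value. This does not help when the schedule itself varies.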
Using Airflow 1.8.0 and Python 2.7
Having the following DAG (simplified):
(Phase 1)-->(Phase 2)
In phase 1 I'm triggering a time-consuming async process that can run for up to 2 days; when the process ends, it writes some payload to S3. During that period I want the DAG to wait, and to continue to phase 2 only when the S3 payload exists.
I thought of 2 solutions:
1. When phase 1 starts, pause the DAG using the experimental REST API and resume it once the process ends.
2. Wait using an operator that checks for the S3 payload every X minutes.
I can't use option 1 since my admin does not allow use of the experimental API, and option 2 seems like bad practice (checking every X minutes).
Are there any other options to solve my task?
I think option 2 is the "correct way"; you can optimize it a bit:
BaseSensorOperator supports poke_interval, so it should be usable for S3KeySensor to increase the time between tries.
poke_interval: Time in seconds that the job should wait in between
each tries
Additionally, you could try to use mode and switch it to reschedule:
mode: How the sensor operates.
Options are: { poke | reschedule }, default is poke.
When set to poke the sensor is taking up a worker slot for its
whole execution time and sleeps between pokes. Use this mode if the
expected runtime of the sensor is short or if a short poke interval
is required. Note that the sensor will hold onto a worker slot and
a pool slot for the duration of the sensor's runtime in this mode.
When set to reschedule the sensor task frees the worker slot when
the criteria is not yet met and it's rescheduled at a later time. Use
this mode if the time before the criteria is met is expected to be
quite long. The poke interval should be more than one minute to
prevent too much load on the scheduler.
Not sure about Airflow 1.8.0 - couldn't find the old documentation (I assume poke_interval is supported, but not mode).
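To illustrate what poke mode amounts to, here is a plain-Python sketch of a check-sleep-retry loop with an interval and an overall timeout. It is not Airflow's implementation; the function wait_until and all of its parameter names are made up for illustration.

```python
import time


def wait_until(condition, poke_interval=60.0, timeout=2 * 24 * 3600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `condition` every `poke_interval` seconds until it is true.

    Hypothetical helper: returns True as soon as condition() is true and
    False once `timeout` seconds have passed without success. The sleep
    and clock arguments exist only so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        # The caller is blocked here, much like a sensor in poke mode
        # holding its worker slot between checks.
        sleep(poke_interval)
```

With a poke interval of several minutes, a wait of up to 2 days costs only a few hundred cheap checks, which is why polling is usually acceptable here.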
I have a scheduling function and a scheduler with a queue of future events ordered by time. I'm using UNIX timestamps and the regular time.time(). One fragment of the scheduler is roughly equivalent to this:
# select the nearest event (eventfunc at eventtime)
sleeptime = eventtime - time.time()
# if the sleep gets interrupted,
# the whole block will be restarted
interruptible_sleep(sleeptime)
eventfunc()
where the eventtime could be computed either based on a delay:
eventtime = time.time() + delay_seconds
or based on an exact date and time, e.g.:
eventtime = datetime(year,month,day,hour,min).timestamp()
Python now has a monotonic clock, and I'm considering modifying the scheduler to use it. Schedulers are supposed to use monotonic time, or so they say.
No problem with delays:
sleeptime = eventtime - time.monotonic()
where:
eventtime = time.monotonic() + delay_seconds
But with the exact time I think the best way is to leave the code as it is. Is that correct?
If yes, I would need two event queues, one based on monotonic time and one based on regular time. I don't like that idea much.
As I said in the comment, your code duplicates the functionality of the sched standard module, so you may as well use this problem as a convenient excuse to migrate to it.
That said,
what you're supposed to do if system time jumps forward or backward is task-specific.
time.monotonic() is designed for cases when you need to do things with set intervals between them, regardless of changes to the system clock.
So, if your solution is expected to instead react to time jumps by running scheduled tasks sooner or later than it otherwise would, in accordance with the new system time, you have no reason to use monotonic time.
If you wish to do both, then you either need two schedulers, or tasks with timestamps of the two kinds.
In the latter case, the scheduler will need to convert one type to the other (every time it calculates how much to wait or whether to run the next task), for which the time module provides no means.
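A possible way around the conversion, sketched below under my own assumptions, is to keep one queue but tag each event with the clock it was scheduled on, and to compute the remaining wait against that clock each time. The Event class and both function names are hypothetical, not part of sched or time.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    when: float                  # timestamp on the clock chosen below
    action: Callable[[], None]
    monotonic: bool              # True: time.monotonic(), False: time.time()


def remaining(event, wall_now, mono_now):
    """Seconds left until the event, measured on the event's own clock."""
    now = mono_now if event.monotonic else wall_now
    return event.when - now


def next_event(events):
    """Return the event with the least remaining time right now."""
    wall_now, mono_now = time.time(), time.monotonic()
    return min(events, key=lambda e: remaining(e, wall_now, mono_now))
```

Because the remaining time is recomputed on every pass, a jump in the system clock shifts only the wall-clock events, while delay-based events keep their intervals. The ordering cannot be cached in a heap, though, since it can change whenever the system clock does.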