scheduling for an exact time with monotonic time - python

I have a scheduling function and a scheduler with a queue of future events ordered by time. I'm using UNIX timestamps and the regular time.time(). One fragment of the scheduler is roughly equivalent to this:
# select the nearest event (eventfunc at eventime)
sleeptime = eventtime - time.time()
# if the sleep gets interrupted,
# the whole block will be restarted
interruptible_sleep(sleeptime)
eventfunc()
where the eventtime could be computed either based on a delay:
eventtime = time.time() + delay_seconds
or based on an exact date and time, e.g.:
eventtime = datetime(year,month,day,hour,min).timestamp()
Now we have monotonic time in Python, and I'm considering modifying the scheduler to use it. Schedulers are supposed to use monotonic time, they say.
No problem with delays:
sleeptime = eventtime - time.monotonic()
where:
eventtime = time.monotonic() + delay_seconds
But with the exact time I think the best way is to leave the code as it is. Is that correct?
If yes, I would need two event queues, one based on monotonic time and one based on regular time. I don't like that idea much.

As I said in the comment, your code duplicates the functionality of the sched standard module, so you might as well take this problem as a convenient excuse to migrate to it.
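For readers who have not used sched, here is a minimal sketch of how a delay-based queue might look with it. The function name run_demo and the delays are made up for illustration; the scheduler is built on time.monotonic, so the delays are unaffected by system clock changes.

```python
import sched
import time


def run_demo(timefunc=time.monotonic, delayfunc=time.sleep):
    """Schedule two delayed calls and return the order in which they ran."""
    fired = []
    scheduler = sched.scheduler(timefunc, delayfunc)
    # enter(delay, priority, action, argument): delay is relative to timefunc()
    scheduler.enter(0.02, 1, fired.append, argument=("second",))
    scheduler.enter(0.01, 1, fired.append, argument=("first",))
    scheduler.run()  # blocks until the queue is empty
    return fired


if __name__ == "__main__":
    print(run_demo())
```

Events are run in time order regardless of the order in which they were entered. For exact-date events, a scheduler built on time.time together with enterabs() would be the analogous setup.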
That said, what you're supposed to do if the system time jumps forward or backward is task-specific.
time.monotonic() is designed for cases when you need to do things at set intervals regardless of anything else, including changes to the system clock.
So, if your solution is expected to instead react to time jumps by running scheduled tasks sooner or later than it otherwise would, in accordance with the new system time, you have no reason to use monotonic time.
If you wish to do both, then you either need two schedulers, or tasks with timestamps of the two kinds.
In the latter case, the scheduler will need to convert one type to the other (every time it calculates how long to wait or whether to run the next task), for which the time module provides no direct means.
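A rough sketch of the "timestamps of two kinds" idea, under the assumption that each queued event carries a tag saying which clock its timestamp belongs to. The tags "wall" and "mono" and the function name seconds_until are invented for this example. The remaining wait is recomputed against the matching clock every time, so a system clock jump automatically shifts the wall-clock events and leaves the delay-based ones alone.

```python
import time


def seconds_until(kind, when, now_wall=None, now_mono=None):
    """Return the remaining wait for an event, never negative.

    kind is "wall" for UNIX timestamps (time.time) or
    "mono" for time.monotonic() based deadlines.
    """
    now_wall = time.time() if now_wall is None else now_wall
    now_mono = time.monotonic() if now_mono is None else now_mono
    if kind == "wall":
        return max(0.0, when - now_wall)
    if kind == "mono":
        return max(0.0, when - now_mono)
    raise ValueError(f"unknown clock kind: {kind!r}")


if __name__ == "__main__":
    deadline = time.monotonic() + 2.5
    print(seconds_until("mono", deadline))
```

With this, a single queue cannot simply be sorted once by timestamp; the scheduler would pick the event with the smallest seconds_until() value each time it wakes up.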

Related

How to synchronize the start time of python threads?

I want to measure the time delay of a signal. To do that, the signal is played on a speaker and the delay until it gets captured by a microphone is estimated. The delay is expected to be in the range of milliseconds, so it is crucial to start the speaker signal and the measurement at the exact same time.
My question is if that can be achieved by using threads:
import threading

def play_sound():
    pass  # play sound

def record():
    pass  # start recording

if __name__ == '__main__':
    # pass the function objects; writing play_sound() would call them right here
    t1 = threading.Thread(target=play_sound)
    t2 = threading.Thread(target=record)
    t1.start()
    t2.start()
or is there a better way to do it?
I would start the recording thread first and look for the first peak in the signal captured by the mic. This will tell you how many ms after recording started the first sound was detected. For this you probably need to know the sampling rate of the mic etc- here is a good starting point.
The timeline is something like this
---- recording start ------- playback start -------- sound first detected ----
You want to find out how many ms after you start recording a sound was picked up ((first_peak - recording_start) in the code below), and then subtract the time it took to start the playback ((playback_start - recording_start) below)
Here's a rough code outline
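from datetime import datetime
from threading import Thread

recording_start, playback_start, first_peak = None, None, None

def play_sound():
    global playback_start  # module-level names need global, not nonlocal
    playback_start = datetime.now()
    # start the playback here

def record():
    global recording_start, first_peak
    recording_start = datetime.now()
    # implement this; it must return a value comparable with recording_start
    first_peak = find_peak_location_in_ms()

recorder = Thread(target=record)  # pass the function itself, do not call it
player = Thread(target=play_sound)
recorder.start()  # note recording starts first
player.start()
# once the threads are finished
recorder.join()
player.join()
delay = (first_peak - recording_start) - (playback_start - recording_start)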
PS: One of the other answers correctly points out that you need to worry about the global interpreter lock. You can likely bypass it by using C-level APIs to record/play the sound without blocking other threads, but you may find Python's not the right tool for that job.
It won't be 100% concurrent real-time, but no solution for desktop will ever be. The question then becomes if it is accurate enough for your application. To know this you should simply run a few tests with known delays and see if it works.
You should know about the global interpreter lock: https://docs.python.org/3.3/glossary.html#term-global-interpreter-lock. This means that even on a multicore PC your code won't run truly concurrently.
If this solution is not accurate enough, you should look into the multiprocessing package. https://docs.python.org/3.3/library/multiprocessing.html
Edit: Well, in order to truly get them to start simultaneously you can't start them sequentially one after the other like that. You need to use multiprocessing, create the two threads, and then create some kind of interrupt that will start the two threads at the same time. And I think even then you can't be truly sure they will start at the same time, because the OS can switch in other work (multitasking), and even if that goes fine, within the processor itself things might be reordered, different code might be cached, etc. On a desktop you can never have the guarantee that two programs start simultaneously. So the question then becomes whether they are consistently simultaneous enough for your purpose. To answer that you will need to find someone with experience in this, or just run a few tests.
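One way to approximate the "start both at the same signal" idea with plain threads is threading.Barrier: each worker does its setup, waits at the barrier, and all are released together. This is only a sketch; the function name start_together is made up, it uses threads rather than processes, and the OS scheduler still decides when each thread actually runs. Recording the release times at least lets you measure how close the starts were.

```python
import threading
import time


def start_together(n_workers=2):
    """Release n_workers threads from a barrier and return their start times."""
    barrier = threading.Barrier(n_workers)
    start_times = [None] * n_workers

    def worker(index):
        barrier.wait()  # blocks until all n_workers have arrived
        start_times[index] = time.perf_counter()
        # the real work (play or record) would begin here

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return start_times


if __name__ == "__main__":
    t_play, t_record = start_together()
    print(f"start skew: {abs(t_play - t_record) * 1000:.3f} ms")
```

Running this a number of times gives a feel for the typical skew on a given machine, which is the kind of test the answer suggests.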

How accurate is python's time.sleep() for long time periods?

I'm wondering how accurate python's time.sleep() method is for longer time periods spanning from a few minutes up to a few days.
My concern is, that there might be a drift which will add up when using this method for longer time periods.
Alternatively I have come up with a different solution to end a loop after a certain amount of time has passed:
end = time.time() + 10000
while 1:
    if time.time() > end:
        break
This is accurate down to a few milliseconds which is fine for my use case and won't drift over time.
Python's time.sleep() function is accurate and should be used in this case, as it is simpler and easier to run. An example is
time.sleep(10000) # suspends only the calling thread for 10000 seconds
Using a bare while loop with no pause in its body will use a lot of CPU, which is why you should add a time.sleep call to reduce this. You should also put the exit test in the while condition itself, as this makes sure the loop ends.
As shown below
end = time.time() + 10000
while end > time.time(): # ends once time.time() (now) passes end
    time.sleep(0.001) # 1 ms pause to decrease CPU usage
I would recommend using the pause module, you can get millisecond precision over a period of days. No need to roll your own here.
https://github.com/jgillick/python-pause
Python's time.sleep() is accurate for any length of time, with two little flaws:
1. The time t must be considered "at least t seconds", as there may be a number of system events that are scheduled to start at the precise moment "time when started" + t.
2. The sleep may be interrupted if the signal handler raises an exception.
I think, but am not certain, that these flaws are found in most programming languages.
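Because a sleep can overshoot or be cut short, a common pattern is to sleep toward a fixed deadline and re-check the clock after every wake-up, so errors do not add up. The sketch below assumes a helper named sleep_until (not a standard function) and uses time.monotonic by default; the max_chunk parameter is an arbitrary choice that bounds each individual sleep.

```python
import time


def sleep_until(deadline, clock=time.monotonic, sleep=time.sleep, max_chunk=60.0):
    """Block until clock() reaches deadline, re-checking after each sleep."""
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return
        # sleep in bounded chunks; any early wake-up is absorbed by the loop
        sleep(min(remaining, max_chunk))


if __name__ == "__main__":
    target = time.monotonic() + 0.2
    sleep_until(target)
    print(f"overshoot: {(time.monotonic() - target) * 1000:.3f} ms")
```

Passing clock=time.time instead would make the wait follow the system clock, which is what you want if the deadline is a calendar time.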

Django run tasks (possibly) in the far future

Suppose I have a model Event. I want to send a notification (email, push, whatever) to all invited users once the event has elapsed. Something along the lines of:
class Event(models.Model):
    start = models.DateTimeField(...)
    end = models.DateTimeField(...)
    invited = models.ManyToManyField(User)

    def onEventElapsed(self):
        # a many-to-many manager is not iterable itself; .all() yields the users
        for user in self.invited.all():
            my_notification_backend.sendMessage(target=user, message="Event has elapsed")
Now, of course, the crucial part is to invoke onEventElapsed whenever timezone.now() >= event.end.
Keep in mind, end could be months away from the current date.
I have thought about two basic ways of doing this:
1. Use a periodic cron job (say, every five minutes or so) which checks if any events have elapsed within the last five minutes and executes my method.
2. Use celery and schedule onEventElapsed using the eta parameter to be run in the future (within the model's save method).
Considering option 1, a potential solution could be django-celery-beat. However, it seems a bit odd to run a task at a fixed interval for sending notifications. In addition, I came up with a (potential) issue that would (probably) result in a not-so-elegant solution:
Check every five minutes for events that have elapsed in the previous five minutes? That seems shaky; maybe some events are missed (or others get their notifications sent twice?). Potential workaround: add a boolean field to the model that is set to True once notifications have been sent.
Then again, option 2 also has its problems:
Manually take care of the situation when an event's start/end datetime is moved. When using celery, one would have to store the task ID (easy, of course), revoke the task once the dates have changed, and issue a new task. But I have read that celery has (design-specific) problems when dealing with tasks that are run in the future: Open Issue on github. I realize how this happens and why it is anything but trivial to solve.
Now, I have come across some libraries which could potentially solve my problem:
celery_longterm_scheduler (But does this mean I cannot use celery as I would have before, because of the different Scheduler class? This also ties into the possible usage of django-celery-beat... Using either of the two frameworks, is it still possible to queue jobs that are just a bit longer-running but not months away?)
django-apscheduler, which uses apscheduler. However, I was unable to find any information on how it would handle tasks that are run in the far future.
Is there a fundamental flaw in the way I am approaching this? I'm glad for any input you might have.
Notice: I know this is likely to be somewhat opinion-based; however, maybe there is a very basic thing that I have missed, regardless of what could be considered by some as ugly or elegant.
We're doing something like this at the company I work for, and the solution is quite simple.
Have a cron job / celery beat task that runs every hour to check whether any notification needs to be sent.
Then send those notifications and mark them as done. This way, even if your notification time is years ahead, it will still be sent. Using ETA is NOT the way to go for a very long wait time; your cache / AMQP broker might lose the data.
You can reduce your interval depending on your needs, but do make sure the runs don't overlap.
If one hour is too coarse, then what you can do is run a scheduler every hour. The logic would be something like:
1. Run a task (let's call this the scheduler task) hourly (via celery beat) that gets all notifications that need to be sent in the next hour.
2. Schedule those notifications via apply_async(eta); this will be the actual sending.
Using that methodology would get you the best of both worlds (eta and beat).
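The selection step of that hourly scheduler task could look roughly like the following. It is written in plain Python so it runs on its own; the Notification class, the sent flag and the function name select_due are stand-ins for what would be a Django model and a queryset filter in a real project, and the Celery call is only mentioned in a comment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Notification:
    due: datetime
    sent: bool = False


def select_due(notifications, now, horizon=timedelta(hours=1)):
    """Return unsent notifications due within the horizon, including overdue ones."""
    return [n for n in notifications if not n.sent and n.due <= now + horizon]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    pending = [
        Notification(now + timedelta(minutes=30)),
        Notification(now + timedelta(days=90)),
    ]
    for n in select_due(pending, now):
        # in the real task: send_notification.apply_async(eta=n.due), then mark as scheduled
        print("would schedule for", n.due)
```

Including overdue but unsent items means a missed run is caught by the next one, and the sent flag guards against duplicates.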

How to write sleep function for my controlled DateTime?

I created a variable
current_time = datetime.datetime.now()
I increase the time by one second on each iteration of a while loop.
while True:
    current_time = current_time + datetime.timedelta(seconds=1)
current_time is a global variable. I am using it as time in my modules. I want to sleep some functions based on this time.
but if I use
time.sleep()
this will use system time.
So how can I create a sleep function that depends on my current_time?
Edit: I am implementing an algorithm that stores a value; let's call it the scheduling time. I want a function to wait until the scheduling time and then execute on time. There are some functions that will update the scheduling time, so it is a repetitive process. I looked for a scheduler library but didn't find anything that can use current_time as its time.
I don't want to use system time directly; for me, current_time is the time of the system/program. So current_time will increase at the speed of the while loop. I am just running the while loop without any relation to real time, because I want my time to advance faster: I am using the code to generate months of data in hours. I want to keep my code as unchanged as possible. I want to generate data based on this algorithm and will replace my artificial time with the real system time for production use.
If you are trying to sleep until current_time is reached, you can do:
while datetime.datetime.now() <= current_time:
    time.sleep(1)
Assuming I'm understanding correctly that your current_time is greater than the actual current time from datetime.datetime.now().
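If the goal is instead to "sleep" in the artificial time itself, one possible approach is to not block at all: register a callback for a simulated timestamp and let the loop that advances current_time fire whatever has become due. The class below is a sketch under that assumption; SimulatedClock, call_at and advance are invented names, not part of any library.

```python
import datetime
import heapq
import itertools


class SimulatedClock:
    """A clock that only moves when advance() is called."""

    def __init__(self, start):
        self.now = start
        self._queue = []                  # heap of (when, tie_breaker, func)
        self._counter = itertools.count()  # keeps heap entries comparable

    def call_at(self, when, func):
        """Arrange for func() to run once simulated time reaches when."""
        heapq.heappush(self._queue, (when, next(self._counter), func))

    def advance(self, step):
        """Move simulated time forward and run everything that became due."""
        self.now += step
        while self._queue and self._queue[0][0] <= self.now:
            _, _, func = heapq.heappop(self._queue)
            func()


if __name__ == "__main__":
    clock = SimulatedClock(datetime.datetime(2020, 1, 1))
    clock.call_at(clock.now + datetime.timedelta(seconds=3),
                  lambda: print("fired at", clock.now))
    for _ in range(5):
        clock.advance(datetime.timedelta(seconds=1))
```

For production, the same call_at interface could be backed by the real clock, which keeps the rest of the code unchanged.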

Python 1 second clock timer

I have a Threaded timer that fires every second and updates a clock, the problem is that sometimes the clock will appear to be unstable and it can jump 2 seconds instead of a steady 1 second increment.
The problem of course is that the initial (or subsequent) timer is not triggered at exactly 0:000 seconds and therefore it is possible that updates to the clock appear to jitter.
Is there any way of preventing this?
import time
from threading import Timer

def timer():
    # re-arm the timer first, then update the displayed time
    Timer(1.00, timer).start()
    STAT['ftime'] = time.strftime("%H:%M:%S")
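import time

start_time = time.time()
interval = 1
for i in range(20):
    # sleep until the i-th tick measured from start_time, so errors do not accumulate;
    # max() guards against a negative argument if an iteration overran its slot
    time.sleep(max(0, start_time + i*interval - time.time()))
    # do a thing
Replace '20' with however many seconds you want to time.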
There are various approaches to scheduling, and some designs even provide a remedy for a blocked or failed initiation on the planned scheduling timeline. What may help is finer timing, hierarchical timing, external synchronisation, or asynchronous operations.
Without more details it would be poor practice to "recommend" anything, but one may get inspired:
If the real-time constraints can bear a bit more overhead, one may use a "supersampled", failure-resilient schedule so as to avoid a 2 second gap when one Timer() initiation fails or fires late: let threading.Timer() fire every 50 ms, and have the callback decide whether it is the right time (no farther than half of one scheduling interval from the idealised one-second edge). If so, it does the real job that was intended to run; otherwise it just returns, because the clock is not "near" the planned scheduling time.
A good Python design also cannot ignore GIL issues: avoid blocking I/O and split the CPU-bound parts of the code into reasonably sized tasks.
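A simpler variation on the same theme is to aim every wake-up at the next whole-second boundary instead of sleeping a fixed 1.0 s, so each tick lands just after the displayed second changes and lateness does not accumulate. The helper name delay_to_next_second is made up for this sketch, and the loop in the usage part is only an illustration.

```python
import math
import time


def delay_to_next_second(now):
    """Return the seconds remaining until the next whole-second boundary."""
    return math.floor(now) + 1.0 - now


if __name__ == "__main__":
    for _ in range(3):
        time.sleep(delay_to_next_second(time.time()))
        print(time.strftime("%H:%M:%S"))
```

The same value could be passed as the interval to threading.Timer when re-arming, in place of the constant 1.00.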
