Is asyncio.loop.time() comparable with datetime.datetime.now() and how? - python

I'm hoping to use an asyncio.loop to set callbacks at specific times. My problem is that I need to schedule these based on datetime.datetime objects (UTC) but asyncio.loop.call_at() uses an internal reference time.
A quick test on python 3.7.3 running on Ubuntu shows that asyncio.loop.time() is reporting the system uptime. For conversion my first thought is to naively store a reference time and use it later:
from asyncio import new_event_loop
from datetime import datetime, timedelta
_loop = new_event_loop()
_loop_base_time = datetime.utcnow() - timedelta(seconds=_loop.time())
def schedule_at(when, callback, *args):
    _loop.call_at((when - _loop_base_time).total_seconds(), callback, *args)
However it's not clear whether or not this offset (datetime.utcnow() - timedelta(seconds=loop.time())) is stable. I have no idea whether system up-time drifts in comparison to UTC even where the system clock is modified (eg: through NTP updates).
Bearing in mind this is for monitoring software which will potentially be running for months at a time, small drifts might be very significant. I should note that I've seen systems lose minutes per day without an NTP daemon and one off NTP updates can shift times by many minutes in a short space of time. Since I don't know if the two are kept in sync, it's unclear how much I need to be concerned.
Note: I am aware of python's issue with scheduling events more than 24 hours in the future. I will get round this by storing distant future events in a list and polling for up-coming events every 12 hours, scheduling them only when they are < 24 hours in the future.
Is it possible to reliably convert from datetime.datetime to asyncio.loop times, or are the two time systems incomparable? If they are comparable, is there anything special I need to do to ensure my calculations are correct?

You could compute the difference in seconds using the same time framework as the one you're using for scheduling, then use loop.call_later with the computed delay:
def schedule_at(when, callback, *args):
    delay = (when - datetime.utcnow()).total_seconds()
    _loop.call_later(delay, callback, *args)
This would work around the question of whether the difference between the loop's time and utcnow is stable; it only needs to be stable between the time of scheduling the task and the time of its execution (which, according to your notes, should be less than 12 hours).
For example: if the event loop's internal clock drifts 1 second apart from utcnow every hour (a deliberately extreme example), you would drift at most 12 seconds per task, but you would not accumulate this error over months of runtime. Compared with the approach of using a fixed reference, this approach gives a better guarantee.
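The same idea can be pushed a little further by re-measuring the delay against the wall clock each time the timer re-arms. The following is only a sketch under assumptions that are not in the answer above: datetimes are timezone-aware UTC, and MAX_CHUNK, next_delay and the inner _check helper are hypothetical names chosen for illustration.

```python
import asyncio
from datetime import datetime, timedelta, timezone

MAX_CHUNK = 3600.0  # hypothetical: longest single wait before re-checking the wall clock


def next_delay(when, now, max_chunk=MAX_CHUNK):
    """Seconds to wait before looking at the wall clock again (never negative)."""
    remaining = (when - now).total_seconds()
    return max(0.0, min(remaining, max_chunk))


def schedule_at(loop, when, callback, *args):
    """Run callback(*args) once the UTC wall clock reaches `when` (timezone-aware)."""
    def _check():
        now = datetime.now(timezone.utc)
        if now >= when:
            callback(*args)
        else:
            # Re-arm with a freshly measured delay, so any clock adjustment
            # made since the previous check is absorbed here.
            loop.call_later(next_delay(when, now), _check)
    loop.call_soon(_check)
```

With this shape, any disagreement between the loop clock and UTC can only build up over one chunk, and waits longer than a chunk are simply split into several shorter ones.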

An alternative approach would be not to rely on the loop's internal clock at all. You can run a task in the background and periodically check whether a callback should be executed.
This method's inaccuracy corresponds to the time you wait before the next check, but I don't think that's critical considering the other possible inaccuracies (like Python GC's stop-the-world pauses, for example).
On the plus side, you aren't limited to 24 hours.
This code shows the main idea:
import asyncio
import datetime
class Timer:
    def __init__(self):
        self._callbacks = set()
        self._task = None
    def schedule_at(self, when, callback):
        self._callbacks.add((when, callback,))
        if self._task is None:
            self._task = asyncio.create_task(self._checker())
    async def _checker(self):
        while True:
            await asyncio.sleep(0.01)
            self._exec_callbacks()
    def _exec_callbacks(self):
        ready_to_exec = self._get_ready_to_exec()
        self._callbacks -= ready_to_exec
        for _, callback in ready_to_exec:
            callback()
    def _get_ready_to_exec(self):
        now = datetime.datetime.utcnow()
        return {
            (when, callback,)
            for (when, callback,)
            in self._callbacks
            if when <= now
        }
timer = Timer()
async def main():
    now = datetime.datetime.utcnow()
    s1_after = now + datetime.timedelta(seconds=1)
    s3_after = now + datetime.timedelta(seconds=3)
    s5_after = now + datetime.timedelta(seconds=5)
    timer = Timer()
    timer.schedule_at(s1_after, lambda: print('Hey!'))
    timer.schedule_at(s3_after, lambda: print('Hey!'))
    timer.schedule_at(s5_after, lambda: print('Hey!'))
    await asyncio.sleep(6)
if __name__ == '__main__':
    asyncio.run(main())
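A possible variation on the polling idea, shown only as a sketch: keeping the pending callbacks in a heap ordered by due time means each check only looks at the earliest entries instead of scanning the whole set. DueQueue and its method names are hypothetical and not part of the answer above.

```python
import heapq
import itertools
from datetime import datetime, timedelta, timezone


class DueQueue:
    """Pending callbacks ordered by their due time."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so two entries with the same time never compare callbacks.
        self._counter = itertools.count()

    def add(self, when, callback):
        heapq.heappush(self._heap, (when, next(self._counter), callback))

    def pop_due(self, now):
        """Remove and return every callback whose due time is <= now, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            due.append(callback)
        return due

    def __len__(self):
        return len(self._heap)
```

A checker task like the one above could call pop_due(datetime.now(timezone.utc)) on each pass and run whatever comes back. Unlike a set of (when, callback) tuples, this also keeps duplicate registrations of the same callback at the same time.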

Related

Python/Django: Schedule task in realtime after every X duration (secs/mins/hours?)

I want to execute some task (function) within my Django application at a specified duration from when a call is made to that. Something like:
... some code
async_run_func(time_interval=15_mins) # Async call. Code within the function
# should be executed after 15 mins.
... some more code
async_run_func is to be executed after some custom interval.
What is the correct approach to achieve this? One way is to create a separate thread and make it sleep for the time_duration period, but that will result in too many threads on the server. Also, if the gunicorn process is restarted, the state will be lost. I want the information to be persistent, so I do not want to go with this approach. Currently I am using Celery for executing long async and periodic tasks, but Celery does not seem to offer an option to run a function a single time after a specified duration.
It would be great if there is a way to do this on a distributed system. For example, the function would be called from one system but the code executed on another (using a queue like RabbitMQ is fine with me). Otherwise, I can also execute it on the same machine. Any suggestions?
Celery has the option of enqueuing at a specific time:
your_async_function.apply_async(args=(your, args, tuple),
                                kwargs={'your': kwargs},
                                countdown=15 * 60)
Or use the signature (subtask) syntax to curry all args, then apply it with a countdown:
your_async_function.s(your, args, tuple, your=kwargs).apply_async(countdown=15 * 60)
If the function has no args, you can skip them and call apply_async directly (delay() forwards all of its arguments to the task, so it cannot take execution options such as countdown):
your_async_function.apply_async(countdown=15 * 60)
What about using the sched module? Simple and efficient.
import sched, time
sc = sched.scheduler(time.time, time.sleep)
sc.enter(15 * 60, 1, async_run_func, ())  # the delay is given in seconds
sc.run()  # blocks until every scheduled event has run
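For reference, here is a small self-contained run of the sched approach. It is a sketch only: the delay is shortened to a fraction of a second for demonstration, time.monotonic is swapped in for time.time so that system clock changes do not affect the delay, and async_run_func is a stand-in that just records that it ran.

```python
import sched
import time

results = []


def async_run_func(label):
    # Stand-in for the real task; it only records that it was called.
    results.append(label)


# time.monotonic is not affected by changes to the system clock.
sc = sched.scheduler(time.monotonic, time.sleep)
sc.enter(0.05, 1, async_run_func, argument=("ran",))  # short delay for the demo
sc.run()  # blocks here until the queue is empty
print(results)
```

Note that the queue lives in memory inside one process, so this does not address the persistence or distribution requirements from the question.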
ETA and Countdown are options to perform this using django-celery.
From the document:
The ETA (estimated time of arrival) lets you set a specific date and time that is the earliest time at which your task will be executed. countdown is a shortcut to set ETA by seconds into the future.
For example:
>>> result = add.apply_async((2, 2), countdown=3)
>>> result.get() # this takes at least 3 seconds to return
4
The task is guaranteed to be executed at some time after the specified date and time, but not necessarily at that exact time. Possible reasons for broken deadlines may include many items waiting in the queue, or heavy network latency. To make sure your tasks are executed in a timely manner you should monitor the queue for congestion.
While countdown is an integer, eta must be a datetime object, specifying an exact date and time (including millisecond precision, and timezone information):
>>> from datetime import datetime, timedelta
>>> tomorrow = datetime.utcnow() + timedelta(days=1)
>>> add.apply_async((2, 2), eta=tomorrow)

Timing Code Execution Time

So, I am interested in timing some of the code I am setting up. Borrowing a timer function from the 4th edition of Learning Python, I tried:
import time
reps = 100
repslist = range(reps)
def timer(func):
    start = time.clock()
    for i in repslist:
        ret = func()
    elasped = time.clock()-start
    return elapsed
Then, I paste in whatever I want to time, and put:
print(timer(func)) #replace func with the function you want to time
When I run it on my code, I do get an answer, but it's nonsense. Suspecting something was wrong, I put a time.sleep(0.1) call in my code, and got a result of 0.8231
Does anybody know why this might be the case or how to fix it? I suspect that the time.clock() call might be at fault.
According to the help docs for clock:
Return the CPU time or real time since the start of the process or since the first call to clock(). This has as much precision as the system records.
The second call to clock already returns the elapsed time between it and the first clock call. You don't need to manually subtract start.
Change
elasped = time.clock()-start
to
elasped = time.clock()
If you want to time a function, perhaps give decorators a try (documentation here):
import time
def timeit(f):
    def timed(*args, **kw):
        ts = time.time()
        result = f(*args, **kw)
        te = time.time()
        print('func:%r args:[%r, %r] took: %2.4f sec' %
              (f.__name__, args, kw, te - ts))
        return result
    return timed
Then when you write a function you just use the decorator, here:
@timeit
def my_example_function():
    for i in range(10000):
        print("x")
This will print out the time the function took to execute:
func:'my_example_function' args:[(), {}] took: 0.4220 sec
After fixing the typo in the first intended use of elapsed, your code works fine with either time.clock or time.time (or Py3's time.monotonic for that matter) on my Linux system.
The difference would be in the (OS specific) behavior for clock; on most UNIX-like OSes it will return the processor time used by the program since it launched (so time spent blocked, on I/O, locks, page faults, etc. wouldn't count), while on Windows it's a wall clock timer (so time spent blocked would count) that counts seconds since first call.
The UNIX-like version of time.clock is also fairly unreliable if used in a long running program when clock_t is only 32 bits; the value it returns will wrap roughly every 72 minutes of processor time.
Of course, time.time isn't perfect either; it follows the system clock, so an NTP time update (or any other change to the system clock) occurring between calls will give erroneous results (on Python 3.3+, you'd use time.monotonic to avoid this problem). It's also not guaranteed to have granularity finer than 1 second, so if your function doesn't take an awfully long time to run, on a system with low res time.time you won't get particularly useful results.
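The properties discussed above can be checked on a given machine with time.get_clock_info (available since Python 3.3). This is a minimal sketch; describe_clock is a hypothetical helper name, and the reported values depend on the platform.

```python
import time


def describe_clock(name):
    """Collect the properties the time module reports for a named clock."""
    info = time.get_clock_info(name)
    return {
        "adjustable": info.adjustable,    # can the clock be changed, e.g. by NTP or an admin?
        "monotonic": info.monotonic,      # is it guaranteed never to go backwards?
        "resolution": info.resolution,    # resolution in seconds
        "implementation": info.implementation,
    }


for name in ("time", "monotonic", "perf_counter", "process_time"):
    print(name, describe_clock(name))
```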
Really, you should be looking at the Python batteries designed for this (that also handle issues like garbage collection overhead and the like). The timeit module already has a function that does what you want, but handles all the edge cases and issues I mentioned. For example, to time some global function named foo for 100 reps, you'd just do:
import timeit
def foo():
    ...
print(timeit.timeit('foo()', 'from __main__ import foo', number=100))
It fixes most of the issues I mention by selecting the best timing function for the OS you're on (and also fixes other sources of jitter, e.g. cyclic garbage collection, which is disabled during the test and reenabled at the end).
Even if you don't want to use that for some reason, if you're using Python 3.3 or higher, take a look at the replacements for time.clock, e.g. time.perf_counter (includes time spent sleeping) or time.process_time (includes only CPU time), both of which are portable, reliable, fast, and high resolution for better accuracy.
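To illustrate the difference between the two replacements, here is a small sketch that measures the same call with both clocks. The measure helper is a hypothetical name; the point is only that a sleeping call shows up in perf_counter but barely in process_time.

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent in a single call to func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


wall, cpu = measure(lambda: time.sleep(0.1))
print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```

This is the same effect that made the time.sleep(0.1) experiment in the question look like nonsense on a system where time.clock counts only CPU time.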
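Note that time.sleep() can be affected by signals: before Python 3.5 it returned early when a signal arrived, while since Python 3.5 it sleeps for at least the requested time unless the signal handler raises an exception. Read about it here ...
http://www.tutorialspoint.com/python/time_sleep.htm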

time.time() drift over repeated calls

I am getting a timestamp every time a key is pressed like this:
init_timestamp = time.time()
while (True):
    c = getch()
    offset = time.time() - init_timestamp
    print("%s,%s" % (c,offset), file=f)
(getch from this answer).
I am verifying the timestamps against an audio recording of me actually typing the keys. After lining the first timestamp up with the waveform, subsequent timestamps drift slightly but consistently. By this I mean that the saved timestamps are later than the keypress waveforms and get later and later as time goes on.
I am reasonably sure the waveform timing is correct (i.e. the recording is not fast or slow), because in the recording I also included the ticking of a very accurate clock which lines up perfectly with the second markers.
I am aware that there are unavoidable limits to the accuracy of time.time(), but this does not seem to account for what I'm seeing - if it was equally wrong on both sides that would be acceptable, but I do not want it to gradually diverge more and more from the truth.
Why would I be seeing this drifting behaviour and what can I do to avoid it?
Just solved this by using time.monotonic() instead of time.time(). time.time() seems to use gettimeofday (at least here it does) which is apparently really bad for measuring walltime differences because of NTP syncing issues:
gettimeofday() and time() should only be used to get the current time if the current wall-clock time is actually what you want. They should never be used to measure time or schedule an event X time into the future.
You usually aren't running NTP on your wristwatch, so it probably won't jump a second or two (or 15 minutes) in a random direction because it happened to sync up against a proper clock at that point. Good NTP implementations try to not make the time jump like this. They instead make the clock go faster or slower so that it will drift to the correct time. But while it's drifting you either have a clock that's going too fast or too slow. It's not measuring the passage of time properly.
(link). So basically measuring differences between time.time() calls is a bad idea.
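A minimal sketch of what the switch to time.monotonic() might look like for the keypress logger. OffsetRecorder is a hypothetical helper, and the clock is injectable only so the behaviour can be checked without real key presses.

```python
import time


class OffsetRecorder:
    """Reports elapsed seconds since construction, using a monotonic clock by default."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()

    def offset(self):
        return self._clock() - self._start


recorder = OffsetRecorder()
first = recorder.offset()
second = recorder.offset()
print(first, second)  # offsets never decrease, even if the system clock is adjusted
```

In the loop from the question, offset = recorder.offset() would replace time.time() - init_timestamp.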
Depending on which OS you are using you will either need to use time.time() or time.clock().
For Windows you will need to use time.clock, which gives you wall-clock seconds as a float. If I remember correctly, time.time() on Windows is only accurate to within 16 ms. (Note that time.clock was removed in Python 3.8; time.perf_counter is the portable replacement.)
For posix systems (linux, osx) you should be using time.time() this is a float which returns the number of seconds since the epoch.
In your code add the following to make your application a little more cross system compatible.
import os
if os.name == 'posix':
    from time import time as get_time
else:
    from time import clock as get_time
# now use get_time() to return the timestamp
init_timestamp = get_time()
while (True):
    c = getch()
    offset = get_time() - init_timestamp
    print("%s,%s" % (c,offset), file=f)
...

Fast and Precise Python Repeating Timer

I need to send repeating messages from a list quickly and precisely. One list needs to send the messages every 100ms, with a +/- 10ms window. I tried using the code below, but the problem is that the timer waits the 100ms, and then all the computation needs to be done, making the timer fall out of the acceptable window.
Simply decreasing the wait is a messy and unreliable hack. There is a Lock around the message loop in case the list gets edited during the loop.
Thoughts on how to get python to send messages consistently around 100ms? Thanks
from threading import Timer
from threading import Lock
class RepeatingTimer(object):
    def __init__(self,interval, function, *args, **kwargs):
        super(RepeatingTimer, self).__init__()
        self.args = args
        self.kwargs = kwargs
        self.function = function
        self.interval = interval
        self.start()
    def start(self):
        self.callback()
    def stop(self):
        self.interval = False
    def callback(self):
        if self.interval:
            self.function(*self.args, **self.kwargs)
            Timer(self.interval, self.callback, ).start()
def loop(messageList):
    listLock.acquire()
    for m in messageList:
        writeFunction(m)
    listLock.release()
MESSAGE_LIST = [] #Imagine this is populated with the messages
listLock = Lock()
rt = RepeatingTimer(0.1,loop,MESSAGE_LIST)
#Do other stuff after this
I do understand that the writeFunction will cause some delay, but not more than the 10ms allowed. I essentially need to call the function every 100ms for each message. The messagelist is small, usually less than elements.
The next challenge is to have this work with every 10ms, +/-1ms :P
Yes, the simple waiting is messy and there are better alternatives.
First off, you need a high-precision timer in Python. There are a few alternatives and depending on your OS, you might want to choose the most accurate one.
Second, you must be aware of the basics of preemptive multitasking and understand that there is no high-precision sleep function, and that its actual resolution will differ from OS to OS too. For example, if we're talking Windows, the minimal sleep interval might be around 10-13 ms.
And third, remember that it's always possible to wait for a very accurate interval of time (assuming you have a high-resolution timer), but with a trade-off of high CPU load. The technique is called busy waiting:
while True:
    if time.clock() >= something:  # compare with >=; an exact float match is almost never hit
        break
So, the actual solution is to create a hybrid timer. It will use the regular sleep function to wait the main bulk of the interval, and then it'll start probing the high-precision timer in the loop, while doing the sleep(0) trick. Sleep(0) will (depending on the platform) wait the least possible amount of time, releasing the rest of the remaining time slice to other processes and switching the CPU context. Here is a relevant discussion.
The idea is thoroughly described in the Ryan Geiss's Timing in Win32 article. It's in C and for Windows API, but the basic principles apply here as well.
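A rough Python sketch of such a hybrid wait, under assumptions not stated in the answer: time.perf_counter is used as the high-resolution clock, and the 2 ms spin_window is an arbitrary illustrative value that would need tuning per platform. wait_until is a hypothetical name.

```python
import time


def wait_until(deadline, spin_window=0.002):
    """Block until time.perf_counter() reaches deadline; return the overshoot in seconds."""
    remaining = deadline - time.perf_counter()
    if remaining > spin_window:
        # Coarse wait: cheap on CPU, but only as precise as the OS scheduler allows.
        time.sleep(remaining - spin_window)
    while time.perf_counter() < deadline:
        # Fine wait: give up the rest of the time slice, then check the clock again.
        time.sleep(0)
    return time.perf_counter() - deadline


target = time.perf_counter() + 0.05
overshoot = wait_until(target)
print(f"overshoot: {overshoot * 1000:.3f} ms")
```

A larger spin_window trades more CPU time for a smaller chance that the coarse sleep overshoots the deadline.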
Store the start time. Send the message. Get the end time. Calculate timeTaken=end-start. Convert to FP seconds. Sleep(0.1-timeTaken). Loop back.
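The steps above could look roughly like the following sketch. It uses a slightly different technique from the one described: instead of subtracting the time taken in each iteration, it sleeps until absolute deadlines computed from the start time, so a late iteration does not push later ones back. run_periodic and its parameters are hypothetical names.

```python
import time


def run_periodic(interval, func, count):
    """Call func `count` times, one call per `interval` seconds; return total elapsed time."""
    start = time.monotonic()
    for i in range(count):
        func()
        # Absolute deadline for the next call, so per-iteration errors do not accumulate.
        deadline = start + (i + 1) * interval
        time.sleep(max(0.0, deadline - time.monotonic()))
    return time.monotonic() - start


calls = []
elapsed = run_periodic(0.02, lambda: calls.append(time.monotonic()), 5)
print(len(calls), f"{elapsed:.3f}s")
```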
Try this:
#!/usr/bin/python
import time  # required for time.time()
from threading import Timer
def hello(start, interval, count):
    ticks = time.time()
    # Subtract the lateness accumulated so far so that the error does not build up.
    t = Timer(interval - (ticks - start - count * interval), hello, [start, interval, count + 1])
    t.start()
    print("Number of ticks since 12:00am, January 1, 1970:", ticks, " #", count)
dt = 1.25  # interval in sec
# start over at full second, round only for testing here;
# count starts at 1 because the first call happens one interval after start
t = Timer(dt, hello, [round(time.time()), dt, 1])
t.start()

How to implement high speed, consistent sampling?

The sort of application to have in mind is an oscilloscope or high speed data logger. I have a function which retrieves the required information, I just need to work out how to call it over and over again, very quickly and with high precision.
There are limitations to time.sleep(), I don't think that is the way to go.
I have looked into the built in event scheduler, but I don't think it's precise enough and doesn't quite fill my needs.
The requirements for this are:
High speed sampling. 10ms is the most that will be asked of it.
High accuracy intervals. At 10ms, a 10% error is acceptable (±1ms).
Fairly low CPU usage, some load is acceptable at 10ms, but it should be less than ~5% for 100ms intervals and beyond. I know this is subjective, I guess what I'm saying is that hogging the CPU is unacceptable.
Ideally, the timer will be initialised with an interval time, and then started when required. The required function should then be called at the correct interval over and over again until the timer is stopped.
It will (not must) only ever run on a Windows machine.
Are there any existing libraries that fulfil these requirements? I don't want to re-invent the wheel, but if I have to I will probably use the Windows multimedia timer (winmm.dll). Any comments/suggestions with that?
I know I'm late to the game answering my own question, but hopefully it will help someone.
I wrote a wrapper to the Windows Multimedia Timer purely as a test. It seems to work well, but the code isn't fully tested and hasn't been optimized.
mmtimer.py:
from ctypes import *
from ctypes.wintypes import UINT
from ctypes.wintypes import DWORD
timeproc = WINFUNCTYPE(None, c_uint, c_uint, DWORD, DWORD, DWORD)
timeSetEvent = windll.winmm.timeSetEvent
timeKillEvent = windll.winmm.timeKillEvent
class mmtimer:
    def Tick(self):
        self.tickFunc()
        if not self.periodic:
            self.stop()
    def CallBack(self, uID, uMsg, dwUser, dw1, dw2):
        if self.running:
            self.Tick()
    def __init__(self, interval, tickFunc, stopFunc=None, resolution=0, periodic=True):
        self.interval = UINT(interval)
        self.resolution = UINT(resolution)
        self.tickFunc = tickFunc
        self.stopFunc = stopFunc
        self.periodic = periodic
        self.id = None
        self.running = False
        self.calbckfn = timeproc(self.CallBack)
    def start(self, instant=False):
        if not self.running:
            self.running = True
            if instant:
                self.Tick()
            self.id = timeSetEvent(self.interval, self.resolution,
                                   self.calbckfn, c_ulong(0),
                                   c_uint(self.periodic))
    def stop(self):
        if self.running:
            timeKillEvent(self.id)
            self.running = False
            if self.stopFunc:
                self.stopFunc()
Periodic test code:
from mmtimer import mmtimer
import time
def tick():
    print("{0:.2f}".format(time.clock() * 1000))
t1 = mmtimer(10, tick)
time.clock()
t1.start(True)
time.sleep(0.1)
t1.stop()
Output in milliseconds:
0.00
10.40
20.15
29.91
39.68
50.43
60.19
69.96
79.72
90.46
100.23
One-shot test code:
from mmtimer import mmtimer
import time
def tick():
    print("{0:.2f}".format(time.clock() * 1000))
t1 = mmtimer(150, tick, periodic=False)
time.clock()
t1.start()
Output in milliseconds:
150.17
As you can see from the results, it's pretty accurate. However, this is only using time.clock() so take them with a pinch of salt.
During a prolonged test with a 10ms periodic timer, CPU usage is around 3% or less on my old dual core 3GHz machine. The machine also seems to use that when it's idle though, so I'd say additional CPU usage is minimal.
Edit: After writing the stuff below, I'd be inclined to implement a similar test for the python event scheduler. I don't see why you think it would be insufficiently accurate.
Something like the following seems to work pretty well under Linux with me (and I have no reason to think it won't work with Windows). Every 10ms, on_timer_event() is called which prints out the time since the last call based on the real-time clock. This shows the approximate accuracy of the timers. Finally, the total time is printed out to show there is no drift.
There seems to be one issue with the code below: events occasionally appear at spurious (and short) intervals. I've no idea why this is, but no doubt with some playing you can make it reliable. I think this sort of approach is the way to go.
import pygame
import time
pygame.init()
TIMER_EVENT = pygame.USEREVENT+1
pygame.time.set_timer(TIMER_EVENT, 10)
timer_count = 0
MAX_TIMER_COUNT = 1000
def on_timer_event():
    global last_time
    global timer_count
    new_time = time.time()
    print(new_time - last_time)  # time since the previous timer event
    last_time = new_time
    timer_count += 1
    if timer_count > MAX_TIMER_COUNT:
        print(last_time - initial_time)  # total run time, to check for drift
        pygame.event.post(pygame.event.Event(pygame.QUIT, {}))
initial_time = time.time()
last_time = initial_time
while True:
    event = pygame.event.wait()
    if event.type == TIMER_EVENT:
        on_timer_event()
    elif event.type == pygame.QUIT:
        break
timed-count was designed for exactly this. It doesn't suffer from temporal drift, so it can be used to repeatedly capture data streams and synchronise them afterwards.
There's a relevant high speed example here.
