Accurately measure time a Python function takes - python

I need to measure the time certain parts of my program take (not for debugging but as a feature in the output). Accuracy is important because the total time will be a fraction of a second.
I was going to use the time module when I came across timeit, which claims to avoid a number of common traps for measuring execution times. Unfortunately it has an awful interface, taking a string as input which it then eval's.
So, do I need to use this module to measure time accurately, or will time suffice? And what are the pitfalls it refers to?
Thanks

According to the Python documentation, it has to do with the accuracy of the time function in different operating systems:
The default timer function is platform dependent. On Windows, time.clock() has microsecond granularity but time.time()'s granularity is 1/60th of a second; on Unix, time.clock() has 1/100th of a second granularity and time.time() is much more precise. On either platform, the default timer functions measure wall clock time, not the CPU time. This means that other processes running on the same computer may interfere with the timing ... On Unix, you can use time.clock() to measure CPU time.
To pull directly from timeit.py's code:
if sys.platform == "win32":
    # On Windows, the best timer is time.clock()
    default_timer = time.clock
else:
    # On most other platforms the best timer is time.time()
    default_timer = time.time
In addition, it deals directly with setting up the runtime code for you. If you use time you have to do that yourself. This, of course, saves you time.
Timeit's setup:
def inner(_it, _timer):
    # Your setup code
    %(setup)s
    _t0 = _timer()
    for _i in _it:
        # The code you want to time
        %(stmt)s
    _t1 = _timer()
    return _t1 - _t0
Python 3:
Since Python 3.3 you can use time.perf_counter() (system-wide timing) or time.process_time() (process-wide timing), just the way you used to use time.clock():
from time import process_time
t = process_time()
#do some stuff
elapsed_time = process_time() - t
The new function process_time will not include time elapsed during sleep.
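To see that behaviour concretely, here is a minimal sketch comparing a wall-clock timer with process_time (assuming Python 3.3+):

```python
import time

start_wall = time.perf_counter()   # system-wide timer, includes sleep
start_cpu = time.process_time()    # process-wide CPU time, excludes sleep

time.sleep(0.5)                    # sleeping consumes wall time, not CPU time

wall_elapsed = time.perf_counter() - start_wall
cpu_elapsed = time.process_time() - start_cpu

print("wall: %.3f s" % wall_elapsed)  # roughly 0.5
print("cpu:  %.3f s" % cpu_elapsed)   # close to 0
```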
Python 3.7+:
Since Python 3.7 you can also use process_time_ns(), which is similar to process_time() but returns the time in nanoseconds.
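A small sketch of the nanosecond variant (assuming Python 3.7+); the integer result avoids float rounding for very short intervals:

```python
import time

t0 = time.process_time_ns()
total = sum(range(100000))                 # some CPU-bound stand-in work
elapsed_ns = time.process_time_ns() - t0   # an int, in nanoseconds

print("took %d ns" % elapsed_ns)
```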

You could build a timing context (see PEP 343) to measure blocks of code pretty easily.
from __future__ import with_statement
import time

class Timer(object):
    def __enter__(self):
        self.__start = time.time()

    def __exit__(self, type, value, traceback):
        # Error handling here
        self.__finish = time.time()

    def duration_in_seconds(self):
        return self.__finish - self.__start

timer = Timer()
with timer:
    # Whatever you want to measure goes here
    time.sleep(2)

print timer.duration_in_seconds()
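On Python 3 the same idea can be sketched with contextlib and time.perf_counter instead of a class (this is my variant, not the original answer's code):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed():
    result = {}
    start = time.perf_counter()
    try:
        yield result                 # caller gets the dict immediately
    finally:
        # filled in when the with-block exits, even on an exception
        result['seconds'] = time.perf_counter() - start

with timed() as t:
    time.sleep(0.2)                  # whatever you want to measure

print(t['seconds'])
```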

The timeit module looks like it's designed for doing performance testing of algorithms, rather than as simple monitoring of an application. Your best option is probably to use the time module, call time.time() at the beginning and end of the segment you're interested in, and subtract the two numbers. Be aware that the number you get may have many more decimal places than the actual resolution of the system timer.
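In its simplest form, that approach looks like this (the summed loop is just a stand-in for the segment you care about):

```python
import time

start = time.time()
# ... the segment you're interested in ...
total = sum(i * i for i in range(100000))
elapsed = time.time() - start

print("segment took %.6f seconds" % elapsed)
```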

I was annoyed too by the awful interface of timeit, so I made a library for this. Check it out, it's trivial to use:
from pythonbenchmark import compare, measure
import time

a, b, c, d, e = 10, 10, 10, 10, 10
something = [a, b, c, d, e]

def myFunction(something):
    time.sleep(0.4)

def myOptimizedFunction(something):
    time.sleep(0.2)

# comparison test with input
compare(myFunction, myOptimizedFunction, 10, something)
# without input
compare(myFunction, myOptimizedFunction, 100)
https://github.com/Karlheinzniebuhr/pythonbenchmark

Have you reviewed the functionality provided by profile or cProfile?
http://docs.python.org/library/profile.html
This provides much more detailed information than just printing the time before and after a function call. Maybe worth a look...

The documentation also mentions that time.clock() and time.time() have different resolution depending on platform. On Unix, time.clock() measures CPU time as opposed to wall clock time.
timeit also disables garbage collection when running the tests, which is probably not what you want for production code.
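If you do want garbage collection running while timeit measures, you can re-enable it in the setup string; a small sketch:

```python
import timeit

# timeit disables GC by default; 'gc.enable()' in setup turns it back on
# for the duration of the timed runs.
with_gc = timeit.timeit('sum(range(1000))',
                        setup='import gc; gc.enable()', number=1000)
without_gc = timeit.timeit('sum(range(1000))', number=1000)

print(with_gc, without_gc)
```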
I find that time.time() suffices for most purposes.

From Python 2.6 on, timeit is no longer limited to string input. Citing the documentation:
Changed in version 2.6: The stmt and setup parameters can now also take objects that are callable without arguments. This will embed calls to them in a timer function that will then be executed by timeit(). Note that the timing overhead is a little larger in this case because of the extra function calls.
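So on 2.6+ (and any Python 3) you can pass a plain callable and skip the string interface entirely; for example:

```python
import timeit

def work():
    return sum(range(1000))

# No string eval needed: pass the callable directly to timeit.
seconds = timeit.timeit(work, number=10000)
print(seconds)
```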

Related

How do I time python code, similar to unix time command? [duplicate]

I am working on a Python script that is going to be run in the command line. The idea is to get a command from the user, run it and then provide the wall-clock time and the CPU time of the command provided by the user. See code below.
#!/usr/bin/env python
import os
import sys

def execut_cmd(cmd_line):
    utime = os.system('time ' + cmd_line)
    # Here I would like to store the wall-clock time in the Python variable
    # utime.
    cputime = os.system('time ' + cmd_line)
    # Here the CPU time in the cputime variable. utime and cputime are going to
    # be used later in my Python script. In addition, I would like to silence the
    # output of time in the screen.

execut_cmd(sys.argv[1])
print('Your command wall-clock time is ' + utime)
print('Your command cpu time is ' + cputime)
How can I accomplish this? Also, if there is a better method than using 'time' I am open to try it.
From Python Documentation for wall time:
... On Windows, time.clock() has microsecond granularity, but time.time()’s granularity is 1/60th of a second. On Unix, time.clock() has 1/100th of a second granularity, and time.time() is much more precise. On either platform, default_timer() measures wall clock time, not the CPU time. This means that other processes running on the same computer may interfere with the timing.
For wall time you can use timeit.default_timer() which gets the timer with best granularity described above.
From Python 3.3 and above you can use time.process_time() or time.process_time_ns(). Below is the documentation entry for the process_time method:
Return the value (in fractional seconds) of the sum of the system and user CPU time of the current process. It does not include time elapsed during sleep. It is process-wide by definition. The reference point of the returned value is undefined, so that only the difference between the results of consecutive calls is valid.
To provide the current wall time, time.time() can be used to get the epoch time.
To provide the elapsed wall time, time.perf_counter() can be used at the start and end of the operation, with the difference in results reflecting the elapsed time. The results cannot be used to give an absolute time, as the reference point is undefined. As mentioned in other answers, you can use timeit.default_timer(), but as of Python 3.3 this always returns time.perf_counter().
To provide the elapsed CPU time, time.process_time() can be used in a similar manner to time.perf_counter(). This will provide the sum of the system and user CPU time.
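Putting these pieces together for the command-timing question above, here is a sketch using subprocess plus resource (POSIX only, since the resource module is not available on Windows; it measures from Python rather than parsing the shell's time output):

```python
import resource
import subprocess
import time

def execute_cmd(cmd_line):
    # CPU time used by child processes so far, before we launch the command.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd_line, shell=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # Child CPU time = user time + system time consumed by the command.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu

utime, cputime = execute_cmd('sleep 0.2')
print('Your command wall-clock time is %.3f' % utime)
print('Your command cpu time is %.3f' % cputime)
```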
In the little time I have spent using the timing functions on Linux systems, I have observed that timeit.default_timer() and time.perf_counter() give numerically the same result.
Also, when measuring the duration of a time interval, timeit.default_timer(), time.perf_counter() and time.time() all give virtually the same result, so any of these functions can be used to measure the elapsed or wall time of a process.
I should also mention that the difference between time.time() and the others is that it gives the current time in seconds since the epoch, which is 1 January 1970, 00:00.
time.clock() and time.process_time() also give the same numerical value.
time.process_time() is the most suitable for measuring CPU time, since time.clock() is already deprecated in Python 3.

Python CPU Clock. time.clock() vs time.perf_counter() vs time.process_time()

I wanted to measure how long sections of my code run. I needed this to be deterministic so that I get the same duration every time (seconds/milliseconds/etc.) regardless of what is happening in the background, so I wanted to measure CPU time using time.clock() on Unix.
time.clock() has since been deprecated, and it's suggested to switch to either perf_counter or process_time. I was looking at the Python docs and found the following:
time.perf_counter() → float: Return the value (in fractional seconds) of a performance counter, i.e. a clock with the highest available resolution to measure a short duration. It does include time elapsed during sleep and is system-wide. The reference point of the returned value is undefined, so that only the difference between the results of consecutive calls is valid.
time.process_time() → float: Return the value (in fractional seconds) of the sum of the system and user CPU time of the current process. It does not include time elapsed during sleep. It is process-wide by definition. The reference point of the returned value is undefined, so that only the difference between the results of consecutive calls is valid.
Since I am not using any form of sleep in my code, the primary difference seems to be "process-wide" versus "system-wide". Could someone elaborate on the difference?
Secondly, is this the correct way to do this?
Both perf_counter and timeit give you the time that your tested block of code took to perform.
time.process_time() does not: it measures the CPU time the process has consumed, which is not necessarily the same as the duration of the function or block of code.
I found this thread on GitHub; it seems the question is quite advanced and the answer may be completely different depending on the OS or the program being benchmarked.
One thing time.process_time() does not count is time spent in child processes:
"One consequence of using time.process_time is that the time spent in child processes of the benchmark is not included. Multithreaded benchmarks also return the total CPU time counting all CPUs."
perf_counter
from time import perf_counter

start = perf_counter()
for _ in range(10000):
    x = "-".join(str(n) for n in range(100))
end = perf_counter()
print('Perf Counter= ', end - start)
# Perf Counter= 0.23170840000000004
timeit
import timeit
print(timeit.timeit('"-".join(str(n) for n in range(100))', number=10000))
# 0.20687929999999993

Timing Code Execution Time

So, I am interested in timing some of the code I am setting up. Borrowing a timer function from the 4th edition of Learning Python, I tried:
import time

reps = 100
repslist = range(reps)

def timer(func):
    start = time.clock()
    for i in repslist:
        ret = func()
    elasped = time.clock() - start
    return elapsed
Then, I paste in whatever I want to time, and put:
print(timer(func)) #replace func with the function you want to time
When I run it on my code, I do get an answer, but it's nonsense. Suspecting something was wrong, I put a time.sleep(0.1) call in my code, and got a result of 0.8231
Does anybody know why this might be the case or how to fix it? I suspect that the time.clock() call might be at fault.
According to the help docs for clock:
Return the CPU time or real time since the start of the process or since the first call to clock(). This has as much precision as the system records.
The second call to clock already returns the elapsed time between it and the first clock call. You don't need to manually subtract start.
Change
elasped = time.clock()-start
to
elasped = time.clock()
If you want to time a function, perhaps give decorators a try (documentation here):
import time

def timeit(f):
    def timed(*args, **kw):
        ts = time.time()
        result = f(*args, **kw)
        te = time.time()
        print 'func:%r args:[%r, %r] took: %2.4f sec' % \
              (f.__name__, args, kw, te - ts)
        return result
    return timed
Then when you write a function you just use the decorator, here:
@timeit
def my_example_function():
    for i in range(10000):
        print "x"
This will print out the time the function took to execute:
func:'my_example_function' args:[(), {}] took: 0.4220 sec
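A Python 3 take on the same decorator (a sketch, using perf_counter rather than time.time, with functools.wraps added so the wrapped function keeps its name):

```python
import functools
import time

def timeit(f):
    @functools.wraps(f)          # preserves f.__name__ on the wrapper
    def timed(*args, **kw):
        ts = time.perf_counter()
        result = f(*args, **kw)
        te = time.perf_counter()
        print('func:%r args:[%r, %r] took: %2.4f sec'
              % (f.__name__, args, kw, te - ts))
        return result
    return timed

@timeit
def my_example_function():
    return sum(range(10000))

my_example_function()
```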
After fixing the typo in the first intended use of elapsed, your code works fine with either time.clock or time.time (or Py3's time.monotonic for that matter) on my Linux system.
The difference would be in the (OS specific) behavior for clock; on most UNIX-like OSes it will return the processor time used by the program since it launched (so time spent blocked, on I/O, locks, page faults, etc. wouldn't count), while on Windows it's a wall clock timer (so time spent blocked would count) that counts seconds since first call.
The UNIX-like version of time.clock is also fairly unreliable if used in a long running program when clock_t is only 32 bits; the value it returns will wrap roughly every 72 minutes of processor time.
Of course, time.time isn't perfect either; it follows the system clock, so an NTP time update (or any other change to the system clock) occurring between calls will give erroneous results (on Python 3.3+, you'd use time.monotonic to avoid this problem). It's also not guaranteed to have granularity finer than 1 second, so if your function doesn't take an awfully long time to run, on a system with low res time.time you won't get particularly useful results.
Really, you should be looking at the Python batteries designed for this (that also handle issues like garbage collection overhead and the like). The timeit module already has a function that does what you want, but handles all the edge cases and issues I mentioned. For example, to time some global function named foo for 100 reps, you'd just do:
import timeit

def foo():
    ...

print(timeit.timeit('foo()', 'from __main__ import foo', number=100))
It fixes most of the issues I mention by selecting the best timing function for the OS you're on (and also fixes other sources of jitter, e.g. cyclic garbage collection, which is disabled during the test and reenabled at the end).
Even if you don't want to use that for some reason, if you're using Python 3.3 or higher, take a look at the replacements for time.clock, e.g. time.perf_counter (includes time spent sleeping) or time.process_time (includes only CPU time), both of which are portable, reliable, fast, and high resolution for better accuracy.
Note that time.sleep() can terminate early if a signal is caught; read about it here:
http://www.tutorialspoint.com/python/time_sleep.htm

time.time() drift over repeated calls

I am getting a timestamp every time a key is pressed like this:
init_timestamp = time.time()
while True:
    c = getch()
    offset = time.time() - init_timestamp
    print("%s,%s" % (c, offset), file=f)
(getch from this answer).
I am verifying the timestamps against an audio recording of me actually typing the keys. After lining the first timestamp up with the waveform, subsequent timestamps drift slightly but consistently. By this I mean that the saved timestamps are later than the keypress waveforms and get later and later as time goes on.
I am reasonably sure the waveform timing is correct (i.e. the recording is not fast or slow), because in the recording I also included the ticking of a very accurate clock which lines up perfectly with the second markers.
I am aware that there are unavoidable limits to the accuracy of time.time(), but this does not seem to account for what I'm seeing - if it was equally wrong on both sides that would be acceptable, but I do not want it to gradually diverge more and more from the truth.
Why would I be seeing this drifting behaviour and what can I do to avoid it?
Just solved this by using time.monotonic() instead of time.time(). time.time() seems to use gettimeofday (at least here it does) which is apparently really bad for measuring walltime differences because of NTP syncing issues:
gettimeofday() and time() should only be used to get the current time if the current wall-clock time is actually what you want. They should never be used to measure time or schedule an event X time into the future.
You usually aren't running NTP on your wristwatch, so it probably won't jump a second or two (or 15 minutes) in a random direction because it happened to sync up against a proper clock at that point. Good NTP implementations try to not make the time jump like this. They instead make the clock go faster or slower so that it will drift to the correct time. But while it's drifting you either have a clock that's going too fast or too slow. It's not measuring the passage of time properly.
(link). So basically measuring differences between time.time() calls is a bad idea.
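Applied to an interval-measuring loop, the monotonic version looks like this sketch (time.sleep stands in for waiting on getch):

```python
import time

# time.monotonic() can never jump backwards or be slewed by NTP, so the
# difference between two calls measures real elapsed time.
init_timestamp = time.monotonic()
for _ in range(3):
    time.sleep(0.1)                              # stand-in for getch()
    offset = time.monotonic() - init_timestamp
    print("%.3f" % offset)
```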
Depending on which OS you are using, you will need either time.time() or time.clock().
For Windows you will need time.clock(); this gives you wall-clock seconds as a float. If I remember correctly, time.time() on Windows is only accurate to within 16 ms.
For POSIX systems (Linux, OS X) you should be using time.time(); this returns a float with the number of seconds since the epoch.
In your code add the following to make your application a little more cross system compatible.
import os

if os.name == 'posix':
    from time import time as get_time
else:
    from time import clock as get_time

# now use get_time() to return the timestamp
init_timestamp = get_time()
while True:
    c = getch()
    offset = get_time() - init_timestamp
    print("%s,%s" % (c, offset), file=f)
...

time.clock() and time.time() resolution in Python2/3

I'm getting really, really confused about the precision of the results of the functions above.
To me the documentation isn't clear at all, for example here are two sentences:
from time module documentation
The precision of the various real-time functions may be less than suggested by the units in which their value or argument is expressed. E.g. on most Unix systems, the clock “ticks” only 50 or 100 times a second.
from timeit module documentation
Define a default timer, in a platform-specific manner. On Windows, time.clock() has microsecond granularity, but time.time()'s granularity is 1/60th of a second. On Unix, time.clock() has 1/100th of a second granularity, and time.time() is much more precise. On either platform, default_timer() measures wall clock time, not the CPU time. This means that other processes running on the same computer may interfere with the timing.
Now, since on Unix real time is returned by time.time() and it has a resolution far better than 1/100th of a second, how can it just "tick" 50 or 100 times a second?
Still on the subject of resolution: I can't understand exactly what resolution I get from each function, so I tried the following and put my guesses in the comments:
>>> time.clock()
0.038955  # a resolution of a microsecond?
>>> time.time()
1410633457.0955694  # a resolution of 10^-7 second?
>>> time.perf_counter()
4548.103329075  # a resolution of 10^-9 second (i.e. nanosecond)?
P.S. This was tried on Python 3.4.0; in Python 2, time.clock() and time.time() always give me 6 digits after the dot, so 1 µs precision?
Precision relates to how often the value changes.
If you could call any of these functions infinitely fast, each of these functions would return a new value at different rates.
Because each returns a floating point value, which doesn't have absolute precision, you cannot tell anything from their return values as to what precision they have. You'll need to measure how the values change over time to see what their precision is.
To show the differences, run:
import time
def average_deltas(*t):
deltas = [t2 - t1 for t1, t2 in zip(t, t[1:])]
return sum(deltas) / len(deltas)
for timer in time.clock, time.time, time.perf_counter:
average = average_deltas(*(timer() for _ in range(1000))) * 10 ** 6
print('{:<12} {:.10f}'.format(timer.__name__, average))
On my Mac this prints:
clock        0.6716716717
time         0.2892525704
perf_counter 0.1550070010
So perf_counter has the greatest precision on my architecture, because it changes more often per second, making the delta between values smaller.
You can use the time.get_clock_info() function to query what precision each method offers:
>>> for timer in time.clock, time.time, time.perf_counter:
...     name = timer.__name__
...     print('{:<12} {:.10f}'.format(name, time.get_clock_info(name).resolution))
...
clock        0.0000010000
time         0.0000010000
perf_counter 0.0000000010
Just want to update this, as things have changed a bit recently.
Using Python 3.8.11 on Ubuntu:
There is no time.clock anymore.
The delta method in the accepted answer doesn't give good metrics: run it a good few times, swapping the order, and you will see bad variations.
However...
import time

for timer in time.time, time.perf_counter:
    name = timer.__name__
    print('{:<12} {:.10f}'.format(name, time.get_clock_info(name).resolution))

time         0.0000000010
perf_counter 0.0000000010
Both are showing nanosecond resolution
