So, I am interested in timing some of the code I am setting up. Borrowing a timer function from the 4th edition of Learning Python, I tried:
import time
reps = 100
repslist = range(reps)
def timer(func):
start = time.clock()
for i in repslist:
ret = func()
elasped = time.clock()-start
return elapsed
Then, I paste in whatever I want to time, and put:
print(timer(func)) #replace func with the function you want to time
When I run it on my code, I do get an answer, but it's nonsense. Suspecting something was wrong, I put a time.sleep(0.1) call in my code, and got a result of 0.8231
Does anybody know why this might be the case or how to fix it? I suspect that the time.clock() call might be at fault.
According to the help docs for clock:
Return the CPU time or real time since the start of the process or since the first call to clock(). This has as much precision as the system records.
The second call to clock already returns the elapsed time between it and the first clock call. You don't need to manually subtract start.
Change
elasped = time.clock()-start
to
elasped = time.clock()
If you want to timer a function perhaps give decorators a try(documentation here):
import time
def timeit(f):
def timed(*args, **kw):
ts = time.time()
result = f(*args, **kw)
te = time.time()
print 'func:%r args:[%r, %r] took: %2.4f sec' % \
(f.__name__, args, kw, te-ts)
return result
return timed
Then when you write a function you just use the decorator, here:
#timeit
def my_example_function():
for i in range(10000):
print "x"
This will print out the time the function took to execute:
func:'my_example_function' args:[(), {}] took: 0.4220 sec
After fixing the typo in the first intended use of elapsed, your code works fine with either time.clock or time.time (or Py3's time.monotonic for that matter) on my Linux system.
The difference would be in the (OS specific) behavior for clock; on most UNIX-like OSes it will return the processor time used by the program since it launched (so time spent blocked, on I/O, locks, page faults, etc. wouldn't count), while on Windows it's a wall clock timer (so time spent blocked would count) that counts seconds since first call.
The UNIX-like version of time.clock is also fairly unreliable if used in a long running program when clock_t is only 32 bits; the value it returns will wrap roughly every 72 minutes of processor time.
Of course, time.time isn't perfect either; it follows the system clock, so an NTP time update (or any other change to the system clock) occurring between calls will give erroneous results (on Python 3.3+, you'd use time.monotonic to avoid this problem). It's also not guaranteed to have granularity finer than 1 second, so if your function doesn't take an awfully long time to run, on a system with low res time.time you won't get particularly useful results.
Really, you should be looking at the Python batteries designed for this (that also handle issues like garbage collection overhead and the like). The timeit module already has a function that does what you want, but handles all the edge cases and issues I mentioned. For example, to time some global function named foo for 100 reps, you'd just do:
import timeit
def foo():
...
print(timeit.timeit('foo()', 'from __main__ import foo', number=100))
It fixes most of the issues I mention by selecting the best timing function for the OS you're on (and also fixes other sources of jitter, e.g. cyclic garbage collection, which is disabled during the test and reenabled at the end).
Even if you don't want to use that for some reason, if you're using Python 3.3 or higher, take a look at the replacements for time.clock, e.g. time.perf_counter (includes time spent sleeping) or time.process_time (includes only CPU time), both of which are portable, reliable, fast, and high resolution for better accuracy.
The time.sleep() will terminate for any signal. read about it here ...
http://www.tutorialspoint.com/python/time_sleep.htm
Related
I am working on a Python script that is going to be run in the command line. The idea is to get a command from the user, run it and then provide the wall-clock time and the CPU time of the command provided by the user. See code below.
#!/usr/bin/env python
import os
import sys
def execut_cmd(cmd_line):
utime = os.system('time '+cmd_line)
# Here I would like to store the wall-clock time in the Python variable
# utime.
cputime = os.system('time '+cmd_line)
# Here the CPU time in the cputime variable. utime and cputime are going to
# be used later in my Python script. In addition, I would like to silence the
# output of time in the screen.
execut_cmd(sys.argv[1])
print ('Your command wall-clock time is '+utime)
print ('Your command cpu time is '+ cputime)
How can I accomplish this? Also, if there is a better method than using 'time' I am open to try it.
From Python Documentation for wall time:
... On Windows, time.clock() has microsecond granularity, but time.time()’s granularity is 1/60th of a second. On Unix, time.clock() has 1/100th of a second granularity, and time.time() is much more precise. On either platform, default_timer() measures wall clock time, not the CPU time. This means that other processes running on the same computer may interfere with the timing.
For wall time you can use timeit.default_timer() which gets the timer with best granularity described above.
From Python 3.3 and above you can use time.process_time() or time.process_time_ns() . Below is the documentation entry for process_time method:
Return the value (in fractional seconds) of the sum of the system and user CPU time of the current process. It does not include time elapsed during sleep. It is process-wide by definition. The reference point of the returned value is undefined, so that only the difference between the results of consecutive calls is valid.
To provide the current wall time, time.time() can be used to get the epoch time.
To provide the elapsed wall time, time.perf_counter() can be used at the start and end of the operation with the difference in results reflecting the elapsed time. The results cannot be used to give an absolute time as the reference point is undefined. As mentioned in other answers, you can use timeit.default_time() but this will always return time.perf_counter() as of python 3.3
To provide the elapsed CPU time, time.process_time() can be used in a similar manner to time.perf_counter(). This will provide the sum of the system and user CPU time.
With the little time I have spent using the timing functions on Linux systems. I have observed that
timeit.default_timer() and time.perf_counter() numerically gives the same result.
Also, when measuring the duration of a time interval, timeit.default_timer(), time.perf_counter() and time.time() all virtually gives the same result. So this means that any of these functions can be used to measure the elapsed time or wall time for any process.
I think I should also mention that the difference between time.time() and others is that it gives the current time in seconds from epoch which is from 1 January 1970 12:00AM
time.clock() and time.process_time() also gives the same numerical value
time.process_time() is most suitable for measuring the cpu time since time.clock() is already deprecated in python 3
I am getting a timestamp every time a key is pressed like this:
init_timestamp = time.time()
while (True):
c = getch()
offset = time.time() - init_timestamp
print("%s,%s" % (c,offset), file=f)
(getch from this answer).
I am verifying the timestamps against an audio recording of me actually typing the keys. After lining the first timestamp up with the waveform, subsequent timestamps drift slighty but consistently. By this I mean that the saved timestamps are later than the keypress waveforms and get later and later as time goes on.
I am reasonably sure the waveform timing is correct (i.e. the recording is not fast or slow), because in the recording I also included the ticking of a very accurate clock which lines up perfectly with the second markers.
I am aware that there are unavoidable limits to the accuracy of time.time(), but this does not seem to account for what I'm seeing - if it was equally wrong on both sides that would be acceptable, but I do not want it to gradually diverge more and more from the truth.
Why would I be seeing this drifting behaviour and what can I do to avoid it?
Just solved this by using time.monotonic() instead of time.time(). time.time() seems to use gettimeofday (at least here it does) which is apparently really bad for measuring walltime differences because of NTP syncing issues:
gettimeofday() and time() should only be used to get the current time if the current wall-clock time is actually what you want. They should never be used to measure time or schedule an event X time into the future.
You usually aren't running NTP on your wristwatch, so it probably won't jump a second or two (or 15 minutes) in a random direction because it happened to sync up against a proper clock at that point. Good NTP implementations try to not make the time jump like this. They instead make the clock go faster or slower so that it will drift to the correct time. But while it's drifting you either have a clock that's going too fast or too slow. It's not measuring the passage of time properly.
(link). So basically measuring differences between time.time() calls is a bad idea.
Depending on which OS you are using you will either need to use time.time() or time.clock().
For windows OS's you will need to use time.clock this give you will clock seconds as a float. time.time() on windows if I remember correctly time.time() is only accurate within 16ms.
For posix systems (linux, osx) you should be using time.time() this is a float which returns the number of seconds since the epoch.
In your code add the following to make your application a little more cross system compatible.
import os
if os.name == 'posix':
from time import time as get_time
else:
from time import clock as get_time
# now use get_time() to return the timestamp
init_timestamp = get_time()
while (True):
c = getch()
offset = get_time() - init_timestamp
print("%s,%s" % (c,offset), file=f)
...
import time
word = {"success":0, "desire":0, "effort":0, ...}
def cleaner(x):
dust = ",./<>?;''[]{}\=+_)(*&^%$##!`~"
for letter in x:
if letter in dust:
x = x[0:x.index(letter)]+x[x.index(letter)+1:]
else:
pass
return x #alhamdlillah it worked 31.07.12
print "input text to analyze"
itext = cleaner(raw_input()).split()
t = time.clock()
for iword in itext:
if iword in word:
word[iword] += 1
else:
pass
print t
print len(itext)
every time i call the code, t will increase. can anyone explain the underlying concept/reason behind this. perhaps in terms of system process? thank you very much, programming lads.
Because you're printing out the current time each time you run the script
That's how time works, it advances, constantly.
If you want to measure the time taken for your for loop (between the first call to time.clock() and the end), print out the difference in times:
print time.clock() - t
You are printing the current time... of course it increases every time you run the code.
From the python documentation for time.clock():
On Unix, return the current processor time as a floating point number
expressed in seconds. The precision, and in fact the very definition
of the meaning of “processor time”, depends on that of the C function
of the same name, but in any case, this is the function to use for
benchmarking Python or timing algorithms.
On Windows, this function returns wall-clock seconds elapsed since the
first call to this function, as a floating point number, based on the
Win32 function QueryPerformanceCounter(). The resolution is typically
better than one microsecond.
time.clock() returns the elapsed CPU time since the process was created. CPU time is based on how many cycles the CPU spent in the context of the process. It is a monotonic function during the lifetime of a process, i.e. if you call time.clock() several times in the same execution, you will get a list of increasing numbers. The difference between two successive invocations of clock() could be less than the elasped wall-clock time or more, depending on wheather the CPU was not running at 100% (e.g. there was some waiting for I/O) or if you have a multithreaded program which consumes more than 100% of CPU time (e.g. multicore CPU with 2 threads using 75% each -> you'd get 150% of the wall-clock time). But if you call clock() once in one process, then rerun the program again, you might get lower value than the one before, if it takes less time to process the input in the new process.
What you should be doing instead is to use time.time() which returns the current Unix timestamp with fractional (subsecond) precision. You should call it once before the processing is started and once after that and subtract the two values in order to compute the wall-clock time elapsed between the two invocations.
Note that on Windows time.clock() returns the elapsed wall-clock time since the process was started. It is like calling time.time() immediately at the beginning of the script and then subtracting the value from later calls to time.time().
There is a really good library called jackedCodeTimerPy that works better than the time module. It also has some clever error checking so you may want to try it out.
Using jackedCodeTimerPy your code should look like this:
# import time
from jackedCodeTimerPY import JackedTiming
JTimer = JackedTiming()
word = {"success":0, "desire":0, "effort":0}
def cleaner(x):
dust = ",./<>?;''[]{}\=+_)(*&^%$##!`~"
for letter in x:
if letter in dust:
x = x[0:x.index(letter)]+x[x.index(letter)+1:]
else:
pass
return x #alhamdlillah it worked 31.07.12
print "input text to analyze"
itext = cleaner(raw_input()).split()
# t = time.clock()
JTimer.start('timer_1')
for iword in itext:
if iword in word:
word[iword] += 1
else:
pass
# print t
JTimer.stop('timer_1')
print JTimer.report()
print len(itext)
It gives really good reports like
label min max mean total run count
------- ----------- ----------- ----------- ----------- -----------
imports 0.00283813 0.00283813 0.00283813 0.00283813 1
loop 5.96046e-06 1.50204e-05 6.71864e-06 0.000335932 50
I like how it gives you statistics on it and the number of times the timer is run.
It's simple to use. If i want to measure the time code takes in a for loop i just do the following:
from jackedCodeTimerPY import JackedTiming
JTimer = JackedTiming()
for i in range(50):
JTimer.start('loop') # 'loop' is the name of the timer
doSomethingHere = 'This is really useful!'
JTimer.stop('loop')
print(JTimer.report()) # prints the timing report
You can can also have multiple timers running at the same time.
JTimer.start('first timer')
JTimer.start('second timer')
do_something = 'amazing'
JTimer.stop('first timer')
do_something = 'else'
JTimer.stop('second timer')
print(JTimer.report()) # prints the timing report
There are more use example in the repo. Hope this helps.
https://github.com/BebeSparkelSparkel/jackedCodeTimerPY
versus something like this:
def time_this(func):
#functools.wraps(func)
def what_time_is_it(*args, **kwargs):
start_time = time.clock()
print 'STARTING TIME: %f' % start_time
result = func(*args, **kwargs)
end_time = time.clock()
print 'ENDING TIME: %f' % end_time
print 'TOTAL TIME: %f' % (end_time - start_time)
return result
return what_time_is_it
I am asking because writing a descriptor likes this seems easier and clearer to me. I recognize that profile/cprofile attempts to estimate bytecode compilation time and such(and subtracts those times from the running time), so more specifically.
I am wondering:
a) when does compilation time become significant enough for such
differences to matter?
b) How might I go about writing my own profiler that takes into
account compilation time?
Profile is slower than cProfile, but does support Threads.
cProfile is a lot faster, but AFAIK it won't profile threads (only the main one, the others will be ignored).
Profile and cProfile have nothing to do with estimating compilation time. They estimate run time.
Compilation time isn't a performance issue. Don't want your code to be compiled every time it's run? import it, and it will be saved as a .pyc, and only recompiled if you change it. It simply doesn't matter how long code takes to compile (it's very fast) since this doesn't have to be done every time it's run.
If you want to time compilation, you can use the compiler package.
Basically:
from timeit import timeit
print timeit('compiler.compileFile(' + filename + ')', 'import compiler', number=100)
will print the time it takes to compile filename 100 times.
If inside func, you append to some lists, do some addition, look up some variables in dictionaries, profile will tell you how long each of those things takes.
Your version doesn't tell you any of those things. It's also pretty inaccurate -- the time you get depends on the time it takes to look up the clock attribute of time and then call it.
If what you want is to time a short section of code, use timeit. If you want to profile code, use profile or cProfile. If what you want to know is how long arbitrary code took to run, but not what parts of it where the slowest, then your version is fine, so long as the code doesn't take just a few miliseconds.
I need to measure the time certain parts of my program take (not for debugging but as a feature in the output). Accuracy is important because the total time will be a fraction of a second.
I was going to use the time module when I came across timeit, which claims to avoid a number of common traps for measuring execution times. Unfortunately it has an awful interface, taking a string as input which it then eval's.
So, do I need to use this module to measure time accurately, or will time suffice? And what are the pitfalls it refers to?
Thanks
According to the Python documentation, it has to do with the accuracy of the time function in different operating systems:
The default timer function is platform
dependent. On Windows, time.clock()
has microsecond granularity but
time.time()‘s granularity is 1/60th of
a second; on Unix, time.clock() has
1/100th of a second granularity and
time.time() is much more precise. On
either platform, the default timer
functions measure wall clock time, not
the CPU time. This means that other
processes running on the same computer
may interfere with the timing ... On Unix, you can
use time.clock() to measure CPU time.
To pull directly from timeit.py's code:
if sys.platform == "win32":
# On Windows, the best timer is time.clock()
default_timer = time.clock
else:
# On most other platforms the best timer is time.time()
default_timer = time.time
In addition, it deals directly with setting up the runtime code for you. If you use time you have to do it yourself. This, of course saves you time
Timeit's setup:
def inner(_it, _timer):
#Your setup code
%(setup)s
_t0 = _timer()
for _i in _it:
#The code you want to time
%(stmt)s
_t1 = _timer()
return _t1 - _t0
Python 3:
Since Python 3.3 you can use time.perf_counter() (system-wide timing) or time.process_time() (process-wide timing), just the way you used to use time.clock():
from time import process_time
t = process_time()
#do some stuff
elapsed_time = process_time() - t
The new function process_time will not include time elapsed during sleep.
Python 3.7+:
Since Python 3.7 you can also use process_time_ns() which is similar to process_time()but returns time in nanoseconds.
You could build a timing context (see PEP 343) to measure blocks of code pretty easily.
from __future__ import with_statement
import time
class Timer(object):
def __enter__(self):
self.__start = time.time()
def __exit__(self, type, value, traceback):
# Error handling here
self.__finish = time.time()
def duration_in_seconds(self):
return self.__finish - self.__start
timer = Timer()
with timer:
# Whatever you want to measure goes here
time.sleep(2)
print timer.duration_in_seconds()
The timeit module looks like it's designed for doing performance testing of algorithms, rather than as simple monitoring of an application. Your best option is probably to use the time module, call time.time() at the beginning and end of the segment you're interested in, and subtract the two numbers. Be aware that the number you get may have many more decimal places than the actual resolution of the system timer.
I was annoyed too by the awful interface of timeit so i made a library for this, check it out its trivial to use
from pythonbenchmark import compare, measure
import time
a,b,c,d,e = 10,10,10,10,10
something = [a,b,c,d,e]
def myFunction(something):
time.sleep(0.4)
def myOptimizedFunction(something):
time.sleep(0.2)
# comparing test
compare(myFunction, myOptimizedFunction, 10, input)
# without input
compare(myFunction, myOptimizedFunction, 100)
https://github.com/Karlheinzniebuhr/pythonbenchmark
Have you reviewed the functionality provided profile or cProfile?
http://docs.python.org/library/profile.html
This provides much more detailed information than just printing the time before and after a function call. Maybe worth a look...
The documentation also mentions that time.clock() and time.time() have different resolution depending on platform. On Unix, time.clock() measures CPU time as opposed to wall clock time.
timeit also disables garbage collection when running the tests, which is probably not what you want for production code.
I find that time.time() suffices for most purposes.
From Python 2.6 on timeit is not limited to input string anymore. Citing the documentation:
Changed in version 2.6: The stmt and setup parameters can now also take objects that are callable without arguments. This will embed calls to them in a timer function that will then be executed by timeit(). Note that the timing overhead is a little larger in this case because of the extra function calls.