In Matlab there is
"timeit(F), which measures the typical time (in seconds) required to run the function specified by the function handle F that takes no input argument."
This method returns the median of (I think 13) runs of the function.
Having looked at the time and timeit modules in Python, I can't quite find anything that will let me time my script (not a function) from an IPython console a number of times and return either an average or a median.
Is there an easy way to do this? Or at least a way to time one execution, so I can write my own loop and average the results?
Thanks
You may want to look at the IPython documentation on magic commands and consider the %timeit magic from IPython.
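For the original question (timing a whole script from an IPython console), a minimal sketch; myscript.py is a hypothetical file name:

# time a single expression; %timeit picks the number of loops and
# repeats automatically and reports the best runs
In [1]: %timeit sum(range(1000))

# time a whole script: -t prints timing information, and -N5 runs the
# script 5 times and reports total and per-run times
In [2]: %run -t -N5 myscript.py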
Example:
Say you define a function you want to test:
from random import random
from math import log

def logtest1(N):
    tr = 0.
    for i in range(N):
        T = 40. + 10.*random()
        tr = tr + -log(random())/T
    return tr
from timeit import repeat

runningtime = repeat("logtest1(int(10e5))", setup="from __main__ import logtest1", repeat=5, number=1)
print(runningtime)
That will run my function logtest1(int(10e5)) once and store the elapsed time in the list runningtime; it then repeats the same measurement 5 times in total, storing each result in that list. You can then take the average or the median of that list.
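To summarize the list, the standard statistics module works well; a minimal sketch:

from statistics import mean, median

print(mean(runningtime))    # average of the 5 repeats
print(median(runningtime))  # more robust to outlier runs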
Related
I have multiple functions whose execution time I repeatedly want to measure using the built-in timeit library. (Say fun1, fun2 and fun3 all depend on a couple of subroutines, some of which I am trying to optimize. After every iteration, I want to know how fast my 3 top-level functions are executing.)
The thing is, I am not sure in advance how long the functions are going to run; I just have a rough estimate. Using timeit.repeat(...) with a sufficient number of repetitions/executions gives me a good estimate, but sometimes it takes a very long time because I accidentally slowed down one of the subroutines. It would be very handy to have a tqdm-like progress bar for the timing routine so I can estimate in advance how long I have to wait until timing is done. I did not find any such feature in the timeit library, so here is the question:
Is it possible to show a (tqdm-like) progress bar when timing functions using timeit.repeat or timeit.timeit?
You can create your subclass of timeit.Timer that uses tqdm to track the total iterations performed.
from timeit import Timer, default_number
from tqdm import tqdm
import itertools
import gc

class ProgressTimer(Timer):
    def timeit(self, number=default_number):
        """Time 'number' executions of the main statement.
        To be precise, this executes the setup statement once, and
        then returns the time it takes to execute the main statement
        a number of times, as a float measured in seconds. The
        argument is the number of times through the loop, defaulting
        to one million. The main statement, the setup statement and
        the timer function to be used are passed to the constructor.
        """
        # wrap the iterator in tqdm
        it = tqdm(itertools.repeat(None, number), total=number)
        gcold = gc.isenabled()
        gc.disable()
        try:
            timing = self.inner(it, self.timer)
        finally:
            if gcold:
                gc.enable()
        # the tqdm bar sometimes doesn't flush on short timers, so print an empty line
        print()
        return timing
To use this object, we just need to pass in the script we want to run. You can either define it as a string (like below) or simply open the script file and read its contents into a variable.
py_setup = 'import numpy as np'
py_script = """
x = np.random.rand(1000)
x.sum()
"""
pt = ProgressTimer(py_script, setup=py_setup)
pt.timeit()
# prints / returns:
100%|███████████████████████████████████████████████| 1000000/1000000 [00:13<00:00, 76749.68it/s]
13.02982600001269
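As mentioned above, the script to be timed can also be read from a file instead of being defined inline; a minimal sketch, where myscript.py is a hypothetical file name:

with open('myscript.py') as f:
    py_script = f.read()

pt = ProgressTimer(py_script, setup=py_setup)
pt.timeit()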
Looking at the source code of timeit, there is a template that gets executed when any timing is done. One could simply change that template to include a progress indicator:
import timeit

timeit.template = """
def inner(_it, _timer{init}):
    from tqdm import tqdm
    {setup}
    _t0 = _timer()
    for _i in tqdm(_it, total=_it.__length_hint__()):
        {stmt}
    _t1 = _timer()
    return _t1 - _t0
"""
# some timeit test:
timeit.timeit(lambda: "-".join(map(str, range(100))), number=1000000)
Of course, this will influence the result, because the tqdm calls are inside the _t0 and _t1 measurements. tqdm's documentation claims that the overhead is only 60 ns per iteration, though.
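If you want to gauge how much the progress bar skews your numbers, you can time a trivial statement with and without the patched template; a rough sketch, assuming the stock template was saved in original_template before the assignment above:

import timeit

n = 1000000
with_bar = timeit.timeit("pass", number=n)  # tqdm template installed

timeit.template = original_template  # restore the stock template
without_bar = timeit.timeit("pass", number=n)

# estimated per-iteration overhead added by tqdm
print((with_bar - without_bar) / n)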
I'm using python timeit to see how long it takes for a function to run.
import timeit

setup = '''from __main__ import my_func
import random
l = list(range(10000))
random.shuffle(l)'''

timeit.timeit('my_func(l)', setup=setup, number=1000)
The results I'm getting are bigger than with a 'normal' check using datetime.
Does timeit also count the time the setup takes, and if so, how can I disable that?
Does my_func(l) mutate l? That could affect the timings.
timeit will run the setup once and reuse the objects created by the setup each time it calls the code being timed. It can also run the code a few times first to gauge roughly how fast it is and to choose the number of iterations before the actual timed run (though not when you've specified the number of runs yourself). That means an initial fast run wouldn't be included in the timed results.
For example, if my_func() were a badly written quicksort function, it might run quickly when you call it on a shuffled list and very slowly when you call it again on the (now sorted) list. timeit would then only measure the very slow calls.
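To see the effect of mutation, compare sorting with and without re-shuffling inside the timed statement; a small sketch:

import timeit

setup = "import random; l = list(range(10000)); random.shuffle(l)"

# l.sort() mutates l: only the first of the 1000 calls sees a shuffled
# list; the other 999 sort already-sorted input
print(timeit.timeit("l.sort()", setup=setup, number=1000))

# re-shuffling keeps every call comparable, at the cost of including
# the shuffle in the measurement
print(timeit.timeit("random.shuffle(l); l.sort()", setup=setup, number=1000))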
The docs say:
The execution time of setup is excluded from the overall timed
execution run.
The Python 2 docs are pretty clear that the setup statement is not timed:
Time number executions of the main statement. This executes the setup
statement once, and then returns the time it takes to execute the main
statement a number of times, measured in seconds as a float.
But if you're not sure, put a big, slow process into the setup statement and test to see what difference it makes.
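A quick sanity check along those lines:

import timeit

# if setup were included, this would take at least 2 seconds
t = timeit.timeit("sum(range(100))", setup="import time; time.sleep(2)", number=1000)
print(t)  # comes out well under 2 seconds, so setup is not timed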
To time a Python function, the most common approach seems to involve the timeit module, which returns the time it took to run the function but does not return the function's output.
Is there a module which does the same thing as timeit but which returns the output of the function in addition to the running time, or is it necessary to implement that manually? If implementing it manually, what's a good timing function for this purpose, one that is reasonably accurate and doesn't have a lot of overhead (options include os.times(), datetime.now(), etc.)?
There are likely a number of approaches to this problem, but here are two you may consider:
Run the function and store its output in a variable, then print the time.clock time after the function completes, immediately before returning the stored output. The cost of the return statement is negligible relative to the function itself.
The above approach may be inappropriate if you are, say, comparing several implementations for both correctness and runtime. In that case, consider returning the function's output and the time.clock output together in a list, which can then be accessed, stored in a struct, etc. Again, the function itself will dominate the cost of the list operations and the return.
As per the comment, use time.clock to get processor-time precision (note that time.clock was deprecated in Python 3.3 and removed in 3.8; time.perf_counter is the modern replacement).
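A minimal sketch of the second approach, using time.perf_counter in place of time.clock (the helper name timed_call is just for illustration):

from time import perf_counter

def timed_call(func, *args, **kwargs):
    # return the function's output together with the elapsed time
    start = perf_counter()
    result = func(*args, **kwargs)
    return [result, perf_counter() - start]

result, elapsed = timed_call(sorted, [3, 1, 2])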
You can try using time.time():
def timefunc(func):
    from time import time
    then = time()
    func()
    print time() - then
As such:
def foo():
    from sys import stdout
    from time import sleep
    for i in range(1, 11):
        stdout.write("\r%d" % i)
        stdout.flush()
        sleep(0.1)
    stdout.write("\n")
>>> timefunc(foo)
10
1.01269602776
>>> timefunc(foo)
10
1.00967097282
>>> timefunc(foo)
10
1.01678395271
>>>
I know I can use this
from timeit import timeit
test = lambda f: timeit(lambda:f(100), number=1)
t1 = test(hello)
To time how long it takes the hello function to run with argument 100. But let's say my hello function returns the string 'world', and I want to store that returned value and also the time it took to execute.
Can that be done?
If you just want to time specific calls to specific functions, you can do the following. This will get you the exact time of one run. However, you probably want to use timeit or profile to get a more accurate picture of what your program is doing; those tools aggregate multiple runs to produce a more reliable average-case result.
Make a function that takes a function to call, records the time, then runs the function and returns the value and time delta.
import time

def timed(func, *args, **kwargs):
    start = time.time()
    out = func(*args, **kwargs)
    return out, time.time() - start

def my_func(value):
    time.sleep(value)
    return value

print(timed(my_func, 5))  # (5, 5.0050437450408936)
Is there any significant difference between:
from time import time
start = time()
# some process
print time() - start
and:
from timeit import timeit
def my_funct():
    pass  # some process

print timeit(my_funct, number=1)
For an example, I'll use Project Euler 1 (because it's really easy to understand/solve)
def pE1test1(): # using time()
    from time import time
    start = time()
    print sum([n for n in range(1, 1000) if n%3==0 or n%5==0])
    print time() - start

def pE1test2(): # using timeit
    print sum([n for n in range(1, 1000) if n%3==0 or n%5==0])

from timeit import timeit

pE1test1()
print timeit(pE1test2, number=1)
This outputs:
>>>
233168
0.0090000629425
233168
0.00513921300363
What is the major difference between timeit and time?
timeit will use the best available timing function on your system. See the docs on timeit.default_timer.
Also, timeit turns off the garbage collector.
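If garbage collection is relevant to what you're measuring, you can turn it back on from the setup string; a small sketch:

import timeit

# timeit disables the garbage collector by default; re-enable it in
# setup if collection cost is part of what you want to measure
timeit.timeit("x = [[] for _ in range(100)]", setup="import gc; gc.enable()", number=10000)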
Also, I believe you're using timeit wrong. You should be passing a string as per the last example in the documentation:
print timeit("pE1test2()","from __main__ import PE1test2",number=1)
And of course, another major difference is that timeit makes it trivial to time the execution of the function for thousands of iterations (which is the only time a timing result is meaningful). This decreases the importance of a single run taking longer than the others (e.g. due to your system resources being hogged by some other program).
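For example, combining many iterations per run with several repeats and taking the minimum is a common pattern:

from timeit import repeat

# 5 repeats of 10000 iterations each; the minimum is the run least
# disturbed by other processes
times = repeat("sum(range(1000))", number=10000, repeat=5)
print(min(times) / 10000)  # best-case time per call, in seconds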
The purposes of the two modules are very different.
The time module provides low-level access to various time/date functions provided by the underlying system.
The timeit module is specifically for running performance tests.
As you point out, you can do simple timing using the functions in time, but there are a number of common pitfalls that people fall into when trying to do performance testing. timeit tries to mitigate those in order to get repeatable numbers that can be sensibly compared.