Does python timeit consider setup in the count

I'm using python timeit to see how long it takes for a function to run.
setup = '''from __main__ import my_func
import random
l = list(range(10000))
random.shuffle(l)'''
timeit.timeit('my_func(l)', setup=setup, number=1000)
the results I'm getting are bigger than a 'normal' check with datetime.
Does timeit also count the time the setup takes, and if so - how can I disable it?

Does my_func(l) mutate l? That could affect the timings.
timeit runs the setup once and reuses the objects it creates on every call of the timed statement. It can also run the code a few trial times to choose the number of iterations before the actual timed run (though not when you've specified the number of runs yourself), which means an initial fast run may not be included in the timed results.
For example if my_func() was a badly written quicksort function it might run quickly when you call it on a shuffled list and very very slowly when you call it again with a (now sorted) list. timeit would only measure the very slow calls.
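You can see this mutation effect without a custom sort (a sketch, not the asker's my_func): list.sort mutates the list in place, so only the first of the 1000 calls sorts shuffled data.

```python
import timeit

setup = "import random; l = list(range(10000)); random.shuffle(l)"
# l.sort() mutates l: after the first call the list stays sorted, so the
# remaining 999 calls time Timsort on already-sorted input, which is much cheaper.
total = timeit.timeit('l.sort()', setup=setup, number=1000)
print(total)
```

The aggregate number hides the fact that 999 of the 1000 iterations did almost no work.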

The docs say:
The execution time of setup is excluded from the overall timed
execution run.

The Python docs are pretty clear that the setup statement is not timed:
Time number executions of the main statement. This executes the setup
statement once, and then returns the time it takes to execute the main
statement a number of times, measured in seconds as a float.
But if you're not sure, put a big, slow process into the setup statement and test to see what difference it makes.
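For instance, here the setup sleeps for a full second, yet the reported time stays tiny because setup runs once, outside the timed loop (a minimal sketch):

```python
import timeit

# If setup were included in the measurement, the result would exceed 1 second.
t = timeit.timeit('x = 1 + 1', setup='import time; time.sleep(1)', number=1000)
print(t)  # a tiny fraction of a second
```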

Related

Show timeit progress

I have multiple functions whose execution time I repeatedly want to measure using the built-in timeit library. (Say fun1, fun2 and fun3 all depend on a couple of subroutines, some of which I am trying to optimize. After every iteration, I want to know how fast my 3 top-level functions are executing.)
The thing is, I am not sure in advance how long the functions are going to run; I just have a rough estimate. Using timeit.repeat(...) with a sufficient number of repetitions/executions gives me a good estimate, but sometimes it takes very long because I accidentally slowed down one of the subroutines. It would be very handy to have a tqdm-like progress bar for the timing routine so I can estimate in advance how long I have to wait until the timing is done. I did not find any such feature in the timeit library, so here is the question:
Is it possible to show a (tqdm-like) progress bar when timing functions using timeit.repeat or timeit.timeit?
You can create a subclass of timeit.Timer that uses tqdm to track the total iterations performed.
from timeit import Timer, default_number
from tqdm import tqdm
import itertools
import gc
class ProgressTimer(Timer):
    def timeit(self, number=default_number):
        """Time 'number' executions of the main statement.

        To be precise, this executes the setup statement once, and
        then returns the time it takes to execute the main statement
        a number of times, as a float measured in seconds. The
        argument is the number of times through the loop, defaulting
        to one million. The main statement, the setup statement and
        the timer function to be used are passed to the constructor.
        """
        # wrap the iterator in tqdm
        it = tqdm(itertools.repeat(None, number), total=number)
        gcold = gc.isenabled()
        gc.disable()
        try:
            timing = self.inner(it, self.timer)
        finally:
            if gcold:
                gc.enable()
        # the tqdm bar sometimes doesn't flush on short timers, so print an empty line
        print()
        return timing
To use this object, we just need to pass in the script we want to run. You can either define it as a string (like below) or you can simply open the file for reading and read to a variable.
py_setup = 'import numpy as np'
py_script = """
x = np.random.rand(1000)
x.sum()
"""
pt = ProgressTimer(py_script, setup=py_setup)
pt.timeit()
# prints / returns:
100%|███████████████████████████████████████████████| 1000000/1000000 [00:13<00:00, 76749.68it/s]
13.02982600001269
Looking at the source code of timeit, there is a template that gets executed when any timing is done. One could simply change that template to include a progress indicator:
import timeit
timeit.template = """
def inner(_it, _timer{init}):
    from tqdm import tqdm
    {setup}
    _t0 = _timer()
    for _i in tqdm(_it, total=_it.__length_hint__()):
        {stmt}
    _t1 = _timer()
    return _t1 - _t0
"""
# some timeit test:
timeit.timeit(lambda: "-".join(map(str, range(100))), number=1000000)
Of course, this will influence the result, because the tqdm calls sit inside the _t0 and _t1 measurements. tqdm's documentation claims that the overhead is only about 60 ns per iteration, though.

Timing Code Execution Time

So, I am interested in timing some of the code I am setting up. Borrowing a timer function from the 4th edition of Learning Python, I tried:
import time
reps = 100
repslist = range(reps)
def timer(func):
    start = time.clock()
    for i in repslist:
        ret = func()
    elasped = time.clock()-start
    return elapsed
Then, I paste in whatever I want to time, and put:
print(timer(func)) #replace func with the function you want to time
When I run it on my code, I do get an answer, but it's nonsense. Suspecting something was wrong, I put a time.sleep(0.1) call in my code, and got a result of 0.8231
Does anybody know why this might be the case or how to fix it? I suspect that the time.clock() call might be at fault.
According to the help docs for clock:
Return the CPU time or real time since the start of the process or since the first call to clock(). This has as much precision as the system records.
The second call to clock already returns the elapsed time between it and the first clock call. You don't need to manually subtract start.
Change
elasped = time.clock()-start
to
elasped = time.clock()
If you want to timer a function perhaps give decorators a try(documentation here):
import time
def timeit(f):
    def timed(*args, **kw):
        ts = time.time()
        result = f(*args, **kw)
        te = time.time()
        print 'func:%r args:[%r, %r] took: %2.4f sec' % \
              (f.__name__, args, kw, te-ts)
        return result
    return timed
Then when you write a function you just use the decorator, here:
@timeit
def my_example_function():
    for i in range(10000):
        print "x"
This will print out the time the function took to execute:
func:'my_example_function' args:[(), {}] took: 0.4220 sec
After fixing the typo in the first intended use of elapsed, your code works fine with either time.clock or time.time (or Py3's time.monotonic for that matter) on my Linux system.
The difference would be in the (OS specific) behavior for clock; on most UNIX-like OSes it will return the processor time used by the program since it launched (so time spent blocked, on I/O, locks, page faults, etc. wouldn't count), while on Windows it's a wall clock timer (so time spent blocked would count) that counts seconds since first call.
The UNIX-like version of time.clock is also fairly unreliable if used in a long running program when clock_t is only 32 bits; the value it returns will wrap roughly every 72 minutes of processor time.
Of course, time.time isn't perfect either; it follows the system clock, so an NTP time update (or any other change to the system clock) occurring between calls will give erroneous results (on Python 3.3+, you'd use time.monotonic to avoid this problem). It's also not guaranteed to have granularity finer than 1 second, so if your function doesn't take an awfully long time to run, on a system with low res time.time you won't get particularly useful results.
Really, you should be looking at the Python batteries designed for this (that also handle issues like garbage collection overhead and the like). The timeit module already has a function that does what you want, but handles all the edge cases and issues I mentioned. For example, to time some global function named foo for 100 reps, you'd just do:
import timeit
def foo():
...
print(timeit.timeit('foo()', 'from __main__ import foo', number=100))
It fixes most of the issues I mention by selecting the best timing function for the OS you're on (and also fixes other sources of jitter, e.g. cyclic garbage collection, which is disabled during the test and reenabled at the end).
Even if you don't want to use that for some reason, if you're using Python 3.3 or higher, take a look at the replacements for time.clock, e.g. time.perf_counter (includes time spent sleeping) or time.process_time (includes only CPU time), both of which are portable, reliable, fast, and high resolution for better accuracy.
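A minimal sketch of the manual approach using time.perf_counter in place of time.clock:

```python
import time

start = time.perf_counter()
total = sum(range(1_000_000))          # the code being timed
elapsed = time.perf_counter() - start  # wall-clock seconds, high resolution
print(f"sum took {elapsed:.6f}s")
```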
time.sleep() will return early if a signal is delivered to the process; read about it here:
http://www.tutorialspoint.com/python/time_sleep.htm

Best way to time a python function, and get that function's output?

To time a python function, the most common approach seems to involve the 'timeit' module, which returns the time that it took to run the function, but does not return the output of the function.
Is there a module which does the same thing as timeit, but which returns the output of the function, in addition to the running time, or is it necessary to implement that manually? If implementing this manually, what's a good timing function for this purpose, which is reasonably accurate, and which doesn't have a lot of overhead (options include, os.times(), datetime.now(), etc.)?
Likely a number of approaches to this problem, but here are two you may consider:
Run the function and store its output in a variable. Compute the time.clock delta after the function completes, immediately before returning the stored output. The cost of the return statement is negligible relative to the function.
The above approach may be inappropriate if you are, say, comparing several implementations for both correctness and runtime. In that case, consider returning the function's output and the time.clock delta together in a list, which can then be accessed, stored in a struct, etc. Again, the function itself will dominate the cost of the list operations and the return.
As per the comment, use time.clock to get processor-time precision.
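A minimal sketch of the second approach; `timed_call` is a hypothetical helper name, and `time.perf_counter` stands in for `time.clock` (which was removed in Python 3.8):

```python
import time

def timed_call(func, *args, **kwargs):
    # Return the function's result together with its elapsed wall-clock time.
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start

result, elapsed = timed_call(sum, range(1000))
print(result, elapsed)
```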
You can try using time.time():
def timefunc(func):
    from time import time
    then = time()
    func()
    print time() - then
As such:
def foo():
    from sys import stdout
    from time import sleep
    for i in range(1, 11):
        stdout.write("\r%d" % i)
        stdout.flush()
        sleep(0.1)
    stdout.write("\n")
>>> timefunc(foo)
10
1.01269602776
>>> timefunc(foo)
10
1.00967097282
>>> timefunc(foo)
10
1.01678395271
>>>

Matlab timeit equivalent in Python for scripts

In Matlab there is
"timeit(F), which measures the typical time (in seconds) required to run the function specified by the function handle F that takes no input argument."
This method returns the median of (I think 13) runs of the function.
Having looked at the time and timeit modules in Python, I can't quite find anything that will let me call from an IPython console, time my script (not a function) a number of times, and return either an average or a median.
Is there an easy way to do this? or at least time 1 execution, whereby I can make my own loop and average?
Thanks
You may want to look at this link and consider the %timeit magic from IPython
link
Example:
Say you define a function you want to test:
from math import log
from random import random

def logtest1(N):
    tr = 0.
    for i in xrange(N):
        T = 40. + 10.*random()
        tr = tr + -log(random())/T

from timeit import Timer, timeit, repeat
runningtime = repeat("logtest1(int(10e5))", setup="from __main__ import logtest1", repeat=5, number=1)
print (runningtime)
That will run my function logtest1(int(10e5)) once per repeat and store each time in the list runningtime; with repeat=5 you get 5 measurements in that list. You can then take the average or the median of that list.
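To get something closer to Matlab's timeit, which reports a median, you can take the median of the repeat list (a sketch):

```python
import statistics
import timeit

# Each repeat runs the statement once; the median damps outliers
# (GC pauses, OS scheduling noise) better than the mean does.
times = timeit.repeat('sum(range(1000))', repeat=7, number=1)
print(statistics.median(times))
```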

How to time how long a Python program takes to run?

Is there a simple way to time a Python program's execution?
clarification: Entire programs
Use timeit:
This module provides a simple way to time small bits of Python code. It has both command line as well as callable interfaces. It avoids a number of common traps for measuring execution times.
You'll need a python statement in a string; if you have a main function in your code, you could use it like this:
>>> from timeit import Timer
>>> timer = Timer('main()', 'from yourmodule import main')
>>> print timer.timeit()
The second string provides the setup, the environment for the first statement to be timed in. The second part is not being timed, and is intended for setting the stage as it were. The first string is then put through its paces; by default a million times, to get accurate timings.
If you need more detail as to where things are slow, use one of the python profilers:
A profiler is a program that describes the run time performance of a program, providing a variety of statistics.
The easiest way to run this is by using the cProfile module from the command line:
$ python -m cProfile yourprogram.py
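cProfile can also be driven from inside a program, which is handy in an interactive session; a minimal sketch using the standard Profile/pstats API:

```python
import cProfile
import io
import pstats

pr = cProfile.Profile()
pr.enable()
total = sum(i * i for i in range(100_000))  # the code under inspection
pr.disable()

# Render the top 5 entries by cumulative time into a string.
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats('cumulative').print_stats(5)
print(buf.getvalue())
```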
You might want to use built-in profiler.
Also you might want to measure function's running time by using following simple decorator:
import time
def myprof(func):
    def wrapping_fun(*args):
        start = time.clock()
        result = func(*args)
        end = time.clock()
        print 'Run time of %s is %4.2fs' % (func.__name__, (end - start))
        return result
    return wrapping_fun
Usage:
@myprof
def myfun():
    # function body
If you're on a Linux/Unix/POSIX-compatible platform, just use time. This way you won't interfere with your script and won't slow it down with unnecessarily detailed (for you) profiling. Naturally, you can use it for pretty much anything, not just Python scripts.
For snippets use the timeit module.
For entire programs use the cProfile module.
Use timeit
>>> import timeit
>>> t = timeit.Timer(stmt="lst = ['c'] * 100")
>>> print t.timeit()
1.10580182076
>>> t = timeit.Timer(stmt="lst = ['c' for x in xrange(100)]")
>>> print t.timeit()
7.66900897026
