How to time how long a Python program takes to run?

Is there a simple way to time a Python program's execution?
clarification: Entire programs

Use timeit:
This module provides a simple way to time small bits of Python code. It has both a command-line as well as a callable interface, and it avoids a number of common traps for measuring execution times.
You'll need a Python statement in a string; if you have a main function in your code, you could use it like this:
>>> from timeit import Timer
>>> timer = Timer('main()', 'from yourmodule import main')
>>> print timer.timeit()
The second string provides the setup: the environment for the first statement to be timed in. This part is not itself timed; it sets the stage, as it were. The first string is then put through its paces, by default a million times, to get accurate timings.
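A million runs is impractical for a slow main(); you can pass a smaller number. A minimal, self-contained sketch (the trivial main() here stands in for your own entry point; on Python 3.5+ you can pass globals() instead of a setup import string):

```python
import timeit

def main():
    # stand-in for your program's entry point
    return sum(range(1000))

# number=10 instead of the default one million runs
elapsed = timeit.timeit('main()', globals=globals(), number=10)
print('%f seconds for 10 runs' % elapsed)
```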
If you need more detail as to where things are slow, use one of the python profilers:
A profiler is a program that describes the run time performance of a program, providing a variety of statistics.
The easiest way to run this is by using the cProfile module from the command line:
$ python -m cProfile yourprogram.py
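The module's command line can also sort the report for you, which makes the hot spots easier to find. A sketch (a tiny stand-in script is created here so the command has something to profile; substitute your own program):

```shell
# create a tiny stand-in script to profile
printf 'print(sum(range(1000)))\n' > yourprogram.py
# sort the report by cumulative time to see the hot spots first
python3 -m cProfile -s cumtime yourprogram.py
```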

You might want to use the built-in profiler.
You might also want to measure a function's running time by using the following simple decorator:
import time

def myprof(func):
    def wrapping_fun(*args):
        start = time.clock()
        result = func(*args)
        end = time.clock()
        print 'Run time of %s is %4.2fs' % (func.__name__, (end - start))
        return result
    return wrapping_fun
Usage:
@myprof
def myfun():
    # function body
    pass
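Note that time.clock() was removed in Python 3.8; a sketch of the same decorator on time.perf_counter(), with functools.wraps added so the wrapped function keeps its name:

```python
import functools
import time

def myprof(func):
    @functools.wraps(func)
    def wrapping_fun(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print('Run time of %s is %4.2fs' % (func.__name__, end - start))
        return result
    return wrapping_fun

@myprof
def myfun():
    time.sleep(0.05)
    return 42

myfun()
```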

If you're on a Linux/Unix/POSIX-compatible platform, just use time. This way you won't interfere with your script and won't slow it down with unnecessarily detailed (for you) profiling. Naturally, you can use it for pretty much anything, not just Python scripts.
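For example (a stand-in script is created here so the command is runnable; "real" is wall-clock time, while "user" plus "sys" approximate CPU time):

```shell
# stand-in script; substitute your own
printf 'print("done")\n' > yourscript.py
# no changes to the script itself
time python3 yourscript.py
```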

For snippets use the timeit module.
For entire programs use the cProfile module.

Use timeit
>>> import timeit
>>> t = timeit.Timer(stmt="lst = ['c'] * 100")
>>> print t.timeit()
1.10580182076
>>> t = timeit.Timer(stmt="lst = ['c' for x in xrange(100)]")
>>> print t.timeit()
7.66900897026

Related

How to benchmark a C program from a python script?

I'm currently doing some work in uni that requires generating multiple benchmarks for multiple short C programs. I've written a python script to automate this process. Up until now I've been using the time module and essentially calculating the benchmark as such:
start = time.time()
successful = run_program(path)
end = time.time()
runtime = end - start
where the run_program function just uses the subprocess module to run the C program:
def run_program(path):
    p = subprocess.Popen(path, shell=True, stdout=subprocess.PIPE)
    p.communicate()[0]
    if (p.returncode > 1):
        return False
    return True
However I've recently discovered that this measures elapsed time and not CPU time, i.e. this sort of measurement is sensitive to noise from the OS. Similar questions on SO suggest that the timeit module is better for measuring CPU time, so I've adapted the run method like so:
def run_program(path):
    command = 'p = subprocess.Popen(\'time ' + path + '\', shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE); out, err = p.communicate()'
    result = timeit.Timer(command, setup='import subprocess').repeat(1, 10)
    return numpy.median(result)
But from looking at the timeit documentation it seems that the timeit module is only meant for small snippets of Python code passed in as a string. So I'm not sure timeit is giving me accurate results for this computation. My question is: will timeit measure the CPU time for every step of the process it runs, or only the CPU time for the actual Python (i.e. the subprocess module) code? Is this an accurate way to benchmark a set of C programs?
timeit will measure the CPU time used by the Python process in which it runs. Execution time of external processes will not be "credited" to those times.
A more accurate way would be to do it in C, where you can get true speed and throughput.
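If rewriting the harness in C is overkill, one hedged alternative on POSIX systems is os.times(), which reports the CPU time consumed by terminated child processes, so the C program's CPU time (not wall time) can be charged to the benchmark. A sketch reusing the question's run-by-path idea (child_cpu_time is a name invented here):

```python
import os
import subprocess

def child_cpu_time(path):
    # children_user/children_system accumulate the CPU time of
    # child processes that have been waited for
    before = os.times()
    subprocess.call(path, shell=True)
    after = os.times()
    return ((after.children_user - before.children_user)
            + (after.children_system - before.children_system))

cpu = child_cpu_time("python3 -c 'sum(range(10**6))'")
print('child CPU time: %.3fs' % cpu)
```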

Best way to time a python function, and get that function's output?

To time a python function, the most common approach seems to involve the 'timeit' module, which returns the time that it took to run the function, but does not return the output of the function.
Is there a module which does the same thing as timeit, but which returns the output of the function, in addition to the running time, or is it necessary to implement that manually? If implementing this manually, what's a good timing function for this purpose, which is reasonably accurate, and which doesn't have a lot of overhead (options include, os.times(), datetime.now(), etc.)?
Likely a number of approaches to this problem, but here are two you may consider:
Run the function and store its output in a variable. Print the time.clock() reading after the function completes, but immediately before returning the output stored in the variable. The cost of the return statement is negligible relative to the function itself.
The above approach may be inappropriate if you are, say, comparing several implementations for both correctness and runtime. In that case, consider returning the function's output and the time.clock() reading together in a list, which can then be accessed, stored in a struct, etc. Again, the function itself will dominate the list operations and the return.
As per the comment, use time.clock to get processor time precision.
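The second approach can be wrapped in a small helper. A sketch (using time.perf_counter(), the modern stand-in for time.clock(); the helper's name is invented here):

```python
import time

def time_and_result(func, *args, **kwargs):
    # returns (function result, elapsed seconds) as a pair
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

value, seconds = time_and_result(sum, range(100))
```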
You can try using time.time():
def timefunc(func):
    from time import time
    then = time()
    func()
    print time() - then
As such:
def foo():
    from sys import stdout
    from time import sleep
    for i in range(1, 11):
        stdout.write("\r%d" % i)
        stdout.flush()
        sleep(0.1)
    stdout.write("\n")
>>> timefunc(foo)
10
1.01269602776
>>> timefunc(foo)
10
1.00967097282
>>> timefunc(foo)
10
1.01678395271
>>>

How do I automate an environment variable dependent benchmark of BLAS in python/numpy?

I need some help in figuring out how to automate a benchmark effort in python.
I'm testing the effects of threading on a BLAS library calls through numpy in python. In a linux environment, threading in OpenBLAS is controlled through the environment variable OMP_NUM_THREADS. I want to do a test where I increment OMP_NUM_THREADS from 1 to a max value, time a routine at each thread count, and then finally manipulate the aggregate timing for all thread counts.
The issue is the following. Environment variables can be set in python, but they only affect subprocesses or subshells. So I can correctly run my benchmark with the following driver code:
#!/usr/bin/env python
# driver script for thread test
import os

thread_set = [1, 2, 4, 8, 16]
for thread in thread_set:
    os.environ['OMP_NUM_THREADS'] = '{:d}'.format(thread)
    os.system("echo $OMP_NUM_THREADS")
    os.system("numpy_test")
and numpy_test script:
#!/usr/bin/env python
#timing test for numpy dot product (using OpenBLAS)
#based on http://stackoverflow.com/questions/11443302/compiling-numpy-with-openblas-integration
import sys
import timeit
setup = "import numpy; x = numpy.random.random((1000,1000))"
count = 5
t = timeit.Timer("numpy.dot(x, x.T)", setup=setup)
dot_time = t.timeit(count)/count
print("dot: {:7.3g} sec".format(dot_time))
but analyzing this is a very manual process.
In particular, I can't return the value dot_time from numpy_test up to my outer wrapper routine, so I can't analyze the results of my test in any automated fashion. As an example, I'd like to plot dot_time vs number of threads, or evaluate whether dot_time/number of threads is constant.
If I try to do a similar test entirely within a python instance by defining a python test function (avoiding the os.system() approach above), and then running the test function within the thread in thread_set loop, then all instances of the test function inherit the same value for OMP_NUM_THREADS (that of the parent python shell). So this test fails:
#!/usr/bin/env python
# attempt at testing threads that doesn't work
# (always uses inherited value of OMP_NUM_THREADS)
import os
import sys
import timeit

def test_numpy():
    setup = "import numpy; x = numpy.random.random((1000,1000))"
    count = 5
    t = timeit.Timer("numpy.dot(x, x.T)", setup=setup)
    dot_time = t.timeit(count)/count
    print("dot: {:7.3g} sec".format(dot_time))
    return dot_time

thread_set = [1, 2, 4, 8, 16]
for thread in thread_set:
    os.environ['OMP_NUM_THREADS'] = '{:d}'.format(thread)
    os.system("echo $OMP_NUM_THREADS")
    time_to_run = test_numpy()
    print(time_to_run)
This fails in that every thread count takes the same time, as test_numpy() always inherits the value of OMP_NUM_THREADS from the parent environment rather than the value set through os.environ. If something like this worked, however, it would be trivial to do the analysis I need to do.
In the real test, I'll be running over a few 1000 permutations, so automation is key. Given that, I'd appreciate an answer to any of these questions:
How would you return a value (dot_time) from a subprocess like this? Is there a more elegant solution than reading/writing a file?
Is there a better way to structure this sort of (environment variable dependent) test?
Thank you in advance.
You can do something like this:
import subprocess
os.environ['OMP_NUM_THREADS'] = '{:d}'.format(thread)
proc = subprocess.Popen(["numpy_test"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = proc.communicate()
Then you'll have the output of the numpy_test script in stdout. In general I believe subprocess.call and subprocess.Popen are preferred over os.system.
If you want to get the output from the subprocess, use subprocess.check_output, e.g. replace
os.system("numpy_test")
with
dot_output = subprocess.check_output(["numpy_test"])
dot_time = ... # extract time from dot_output
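You can also hand each subprocess its own environment through the env argument instead of mutating os.environ, which keeps the loop self-contained and makes the thread count explicit per call. A sketch (run_with_threads is a name invented here; numpy_test and its "dot: ... sec" output format are from the question):

```python
import os
import subprocess

def run_with_threads(cmd, nthreads):
    # per-call environment: the parent's env plus OMP_NUM_THREADS
    env = dict(os.environ, OMP_NUM_THREADS=str(nthreads))
    return subprocess.check_output(cmd, env=env).decode()

# e.g. run_with_threads(["numpy_test"], 4), then parse the "dot: ... sec" line
```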

Python question about time spent

I would like to know that how much time a particular function has spent during the duration of the program which involves recursion, what is the best way of doing it?
Thank you
The best way would be to run some benchmark tests (to test individual functions) or Profiling (to test an entire application/program). Python comes with built-in Profilers.
Alternatively, you could go back to the very basics by simply recording a start time at the beginning of the program and, at the end, subtracting the start time from the current time. This is basically very simple benchmarking.
Here is an implementation from an answer to the linked question:
import time
start = time.time()
do_long_code()
print "it took", time.time() - start, "seconds."
Python has something for benchmarking included in its standard library, as well.
From the example given on the page:
def test():
    "Time me"
    L = []
    for i in range(100):
        L.append(i)

if __name__ == '__main__':
    from timeit import Timer
    t = Timer("test()", "from __main__ import test")
    print t.timeit()
Use the profiler!
python -m cProfile -o prof yourscript.py
runsnake prof
runsnake is a nice tool for looking at the profiling output. You can of course use other tools.
More on the Profiler here: http://docs.python.org/library/profile.html
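For a recursive function in particular, the profiler's cumtime column is what you want: it is the total time spent in the function including all of its recursive calls. A minimal sketch (the naive fib here just stands in for your recursive function):

```python
import cProfile
import io
import pstats

def fib(n):
    # deliberately naive recursion to give the profiler something to measure
    return n if n < 2 else fib(n - 1) + fib(n - 2)

pr = cProfile.Profile()
pr.enable()
fib(20)
pr.disable()

buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats('cumulative').print_stats(5)
print(buf.getvalue())
```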

accurately measure time python function takes

I need to measure the time certain parts of my program take (not for debugging but as a feature in the output). Accuracy is important because the total time will be a fraction of a second.
I was going to use the time module when I came across timeit, which claims to avoid a number of common traps for measuring execution times. Unfortunately it has an awful interface, taking a string as input which it then eval's.
So, do I need to use this module to measure time accurately, or will time suffice? And what are the pitfalls it refers to?
Thanks
According to the Python documentation, it has to do with the accuracy of the time function in different operating systems:
The default timer function is platform dependent. On Windows, time.clock() has microsecond granularity but time.time()'s granularity is 1/60th of a second; on Unix, time.clock() has 1/100th of a second granularity and time.time() is much more precise. On either platform, the default timer functions measure wall clock time, not the CPU time. This means that other processes running on the same computer may interfere with the timing ... On Unix, you can use time.clock() to measure CPU time.
To pull directly from timeit.py's code:
if sys.platform == "win32":
    # On Windows, the best timer is time.clock()
    default_timer = time.clock
else:
    # On most other platforms the best timer is time.time()
    default_timer = time.time
In addition, it deals directly with setting up the runtime code for you; if you use time, you have to do that yourself. This, of course, saves you time.
Timeit's setup:
def inner(_it, _timer):
    # Your setup code
    %(setup)s
    _t0 = _timer()
    for _i in _it:
        # The code you want to time
        %(stmt)s
    _t1 = _timer()
    return _t1 - _t0
Python 3:
Since Python 3.3 you can use time.perf_counter() (system-wide timing) or time.process_time() (process-wide timing), just the way you used to use time.clock():
from time import process_time
t = process_time()
#do some stuff
elapsed_time = process_time() - t
The new function process_time will not include time elapsed during sleep.
Python 3.7+:
Since Python 3.7 you can also use process_time_ns(), which is similar to process_time() but returns the time in nanoseconds.
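The difference between the two clocks is easy to see with a sleep; a minimal sketch:

```python
import time

t_wall = time.perf_counter()
t_cpu = time.process_time()
time.sleep(0.2)
wall = time.perf_counter() - t_wall
cpu = time.process_time() - t_cpu
# the sleep shows up in wall-clock time but consumes almost no CPU time
print('wall %.3fs, cpu %.3fs' % (wall, cpu))
```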
You could build a timing context (see PEP 343) to measure blocks of code pretty easily.
from __future__ import with_statement
import time

class Timer(object):
    def __enter__(self):
        self.__start = time.time()

    def __exit__(self, type, value, traceback):
        # Error handling here
        self.__finish = time.time()

    def duration_in_seconds(self):
        return self.__finish - self.__start

timer = Timer()
with timer:
    # Whatever you want to measure goes here
    time.sleep(2)

print timer.duration_in_seconds()
The timeit module looks like it's designed for doing performance testing of algorithms, rather than as simple monitoring of an application. Your best option is probably to use the time module, call time.time() at the beginning and end of the segment you're interested in, and subtract the two numbers. Be aware that the number you get may have many more decimal places than the actual resolution of the system timer.
I was also annoyed by the awful interface of timeit, so I made a library for this. Check it out; it's trivial to use:
from pythonbenchmark import compare, measure
import time

a, b, c, d, e = 10, 10, 10, 10, 10
something = [a, b, c, d, e]

def myFunction(something):
    time.sleep(0.4)

def myOptimizedFunction(something):
    time.sleep(0.2)

# comparing test with input
compare(myFunction, myOptimizedFunction, 10, something)
# without input
compare(myFunction, myOptimizedFunction, 100)
https://github.com/Karlheinzniebuhr/pythonbenchmark
Have you reviewed the functionality provided by profile or cProfile?
http://docs.python.org/library/profile.html
This provides much more detailed information than just printing the time before and after a function call. Maybe worth a look...
The documentation also mentions that time.clock() and time.time() have different resolution depending on platform. On Unix, time.clock() measures CPU time as opposed to wall clock time.
timeit also disables garbage collection when running the tests, which is probably not what you want for production code.
I find that time.time() suffices for most purposes.
From Python 2.6 on timeit is not limited to input string anymore. Citing the documentation:
Changed in version 2.6: The stmt and setup parameters can now also take objects that are callable without arguments. This will embed calls to them in a timer function that will then be executed by timeit(). Note that the timing overhead is a little larger in this case because of the extra function calls.
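So on 2.6+ you can skip the string interface entirely and pass the function itself; a quick sketch:

```python
import timeit

def test():
    return sum(range(1000))

# pass the callable directly; no setup string needed
elapsed = timeit.timeit(test, number=1000)
print(elapsed)
```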
