I'm trying to measure the execution time of a small Python code snippet of mine and I'm wondering what's the best way to do so.
Ideally, I would like to run some sort of setup (which takes a loooong time), then run some test code a couple of times, and get the minimum time of these runs.
timeit() seemed appropriate, but I'm not sure how to obtain the minimum time without re-executing the setup. Small code snippet demonstrating the question:
import timeit
setup = 'a = 2.0' # expensive
stmt = 'b = a**2' # also takes significantly longer than timer resolution
# this executes setup and stmt 10 times and the minimum of these 10
# runs is returned:
timings1 = timeit.repeat(stmt=stmt, setup=setup, repeat=10, number=1)
# this executes setup once and stmt 10 times but the overall time of
# these 10 runs is returned (and I would like to have the minimum
# of the 10 runs):
timings2 = timeit.repeat(stmt=stmt, setup=setup, repeat=1, number=10)
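One possible approach (a sketch, assuming Python 3.5+ where the timeit functions accept a globals argument) is to run the expensive setup once in an ordinary namespace and hand that namespace to repeat(), so only the statement itself is repeated:

import timeit

ns = {}
exec('a = 2.0', ns)  # the expensive setup, executed exactly once

# each of the 10 runs is timed individually, so min() gives the fastest run
timings = timeit.repeat('b = a**2', repeat=10, number=1, globals=ns)
print(min(timings))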
Have you tried using datetime to do your timing for you?
import datetime
start = datetime.datetime.now()
# ... code being timed ...
print datetime.datetime.now() - start # prints a datetime.timedelta object
That gives you the elapsed time, lets you control exactly where the timing starts, and adds only tiny overhead.
Edit: Here is a video of someone using this approach for timing as well; it seems to be the easiest way to get the running time. http://www.youtube.com/watch?v=Iw9-GckD-gQ
We are a team of bachelor students currently working on building a legged robot. At the moment, our interface to the robot is written in Python using an SDK for the master board we are using.
In order to communicate with the master board SDK, we need to send a command every millisecond.
To allow us to run tasks periodically, we have applied the rt-preempt patch to our Linux kernel (Ubuntu 20.04 LTS, kernel 5.10.27-rt36).
We are very new to writing real time applications, and have run into some issues where our task sometimes will have a much smaller time step than specified. In the figure below we have plotted the time of each cycle of the while loop where the command is being sent to the sdk. (x axis is time in seconds and y axis is the elapsed time of an iteration, also in seconds)
As seen in the plot, one step is much smaller than the rest. This seems to happen at the same exact time mark every time we run the script.
[plot: cyclic_task_plot]
We set the priority of the entire script using:
pid = os.getpid()
sched = os.SCHED_FIFO
param = os.sched_param(98)
os.sched_setscheduler(pid, sched, param)
Our cyclic task looks like this:
dt is set to 0.001
while(_running):
    if direction:
        q = q + 0.0025
        if (q > np.pi/2).any():
            direction = False
    else:
        q = q - 0.0025
        if (q < -np.pi/2).any():
            direction = True

    master_board.track_reference(q, q_prime)

    #Terminate if duration has passed
    if (time.perf_counter() - program_start > duration):
        _running = False

    cycle_time = time.perf_counter() - cycle_start
    time.sleep(dt - cycle_time)
    cycle_start = time.perf_counter()

    timestep_end = time.perf_counter()
    time_per_timestep_array.append(timestep_end - timestep_start)
    timestep_start = time.perf_counter()
We suspect the issue has to do with the way we define the sleep amount. cycle_time is meant to be the time that the calculations above time.sleep() take, so that sleep time + cycle time = 1 ms. However, we are not sure how to do this properly, and we're struggling to find resources on the subject.
How should one properly define a task such as this for a real time application?
We have quite loose requirements (several milliseconds), but it is very important to us that it is deterministic, as this is part of our thesis and we need to understand what is going on.
Any answers to our question or relevant resources are greatly appreciated.
Link to the full code: https://drive.google.com/drive/folders/12KE0EBaLc2rkTZK2FuX_goMF4MgWtknS?usp=sharing
timestep_end = time.perf_counter()
time_per_timestep_array.append(timestep_end - timestep_start)
timestep_start = time.perf_counter()
You're recording the time between timestep_start from the previous cycle and timestep_end from the current cycle. This interval does not accurately represent the cycle time step (even if we assume that no task preemption takes place): it excludes the time consumed by the array append call. Since the outlier seems to happen at the exact same time mark every time you run the script, we could suspect that at this point the array exceeds a certain size where an expensive memory reallocation has to take place. Regardless of the real reason, you should remove such timing inaccuracies by recording the time between cycle starts:
timestep_end = cycle_start
time_per_timestep_array.append(timestep_end - timestep_start)
timestep_start = cycle_start
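As for how to structure the periodic task itself, one common pattern (a sketch, not taken from your code; do_cyclic_work is a hypothetical stand-in for the SDK call) is to sleep until an absolute deadline that advances by dt every cycle, rather than sleeping for dt minus the measured work time, so timing errors do not accumulate:

import time

def do_cyclic_work():
    pass  # hypothetical placeholder for master_board.track_reference(...)

dt = 0.001                      # desired period in seconds
duration = 1.0                  # run for one second in this sketch
start = time.perf_counter()
next_deadline = start + dt
while time.perf_counter() - start < duration:
    do_cyclic_work()
    # sleep until the absolute deadline; the deadline advances by exactly
    # dt each cycle, so a long iteration is followed by a shorter sleep
    next_deadline += dt
    delay = next_deadline - time.perf_counter()
    if delay > 0:
        time.sleep(delay)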
I'm using python timeit to see how long it takes for a function to run.
setup = '''from __main__ import my_func
import random
l = list(range(10000))
random.shuffle(l)'''
timeit.timeit('my_func(l)', setup=setup, number=1000)
The results I'm getting are bigger than a 'normal' check with datetime.
Does timeit also count the time the setup takes, and if so - how can I disable it?
Does my_func(l) mutate l? That could affect the timings.
timeit will run the setup once and reuse the objects created by the setup each time it calls the code being timed. It may also call the code a few times to gauge roughly how fast it runs and to choose the number of iterations before the actual timed run (though not when you've specified the number of runs yourself). That means an initial fast run wouldn't be included in the timed results.
For example if my_func() was a badly written quicksort function it might run quickly when you call it on a shuffled list and very very slowly when you call it again with a (now sorted) list. timeit would only measure the very slow calls.
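If mutation is the suspect, one cheap check (a sketch reusing the setup from the question, so my_func must be defined in __main__ as above; the extra copy adds a small, constant overhead) is to hand my_func a fresh copy on every call so each run sees the same input:

import timeit

setup = '''from __main__ import my_func
import random
l = list(range(10000))
random.shuffle(l)'''

# list(l) rebuilds the shuffled list for every call, so my_func never
# receives an already-processed (e.g. already-sorted) input
print(timeit.timeit('my_func(list(l))', setup=setup, number=1000))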
The docs say:
The execution time of setup is excluded from the overall timed execution run.
The Python 2 docs are pretty clear that the setup statement is not timed:
Time number executions of the main statement. This executes the setup statement once, and then returns the time it takes to execute the main statement a number of times, measured in seconds as a float.
But if you're not sure, put a big, slow process into the setup statement and test to see what difference it makes.
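For example, a minimal sketch of such a test; the one-second sleep stands in for any expensive setup and should not show up in the reported time:

import timeit

# the sleep runs once, before timing starts; the result should be close to
# the cost of 100 executions of the statement alone, not ~1 second
print(timeit.timeit('sum(range(1000))', setup='import time; time.sleep(1)', number=100))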
So, I am interested in timing some of the code I am setting up. Borrowing a timer function from the 4th edition of Learning Python, I tried:
import time
reps = 100
repslist = range(reps)
def timer(func):
    start = time.clock()
    for i in repslist:
        ret = func()
    elasped = time.clock()-start
    return elapsed
Then, I paste in whatever I want to time, and put:
print(timer(func)) #replace func with the function you want to time
When I run it on my code, I do get an answer, but it's nonsense. Suspecting something was wrong, I put a time.sleep(0.1) call in my code, and got a result of 0.8231
Does anybody know why this might be the case or how to fix it? I suspect that the time.clock() call might be at fault.
According to the help docs for clock:
Return the CPU time or real time since the start of the process or since the first call to clock(). This has as much precision as the system records.
The second call to clock already returns the elapsed time between it and the first clock call. You don't need to manually subtract start.
Change
elasped = time.clock()-start
to
elasped = time.clock()
If you want to time a function, perhaps give decorators a try (documentation here):
import time
def timeit(f):
    def timed(*args, **kw):
        ts = time.time()
        result = f(*args, **kw)
        te = time.time()
        print 'func:%r args:[%r, %r] took: %2.4f sec' % \
            (f.__name__, args, kw, te-ts)
        return result
    return timed
Then when you write a function you just use the decorator, here:
@timeit
def my_example_function():
    for i in range(10000):
        print "x"
This will print out the time the function took to execute:
func:'my_example_function' args:[(), {}] took: 0.4220 sec
After fixing the typo in the first intended use of elapsed, your code works fine with either time.clock or time.time (or Py3's time.monotonic for that matter) on my Linux system.
The difference would be in the (OS specific) behavior for clock; on most UNIX-like OSes it will return the processor time used by the program since it launched (so time spent blocked, on I/O, locks, page faults, etc. wouldn't count), while on Windows it's a wall clock timer (so time spent blocked would count) that counts seconds since first call.
The UNIX-like version of time.clock is also fairly unreliable if used in a long running program when clock_t is only 32 bits; the value it returns will wrap roughly every 72 minutes of processor time.
Of course, time.time isn't perfect either; it follows the system clock, so an NTP time update (or any other change to the system clock) occurring between calls will give erroneous results (on Python 3.3+, you'd use time.monotonic to avoid this problem). It's also not guaranteed to have granularity finer than 1 second, so if your function doesn't take an awfully long time to run, on a system with low res time.time you won't get particularly useful results.
Really, you should be looking at the Python batteries designed for this (that also handle issues like garbage collection overhead and the like). The timeit module already has a function that does what you want, but handles all the edge cases and issues I mentioned. For example, to time some global function named foo for 100 reps, you'd just do:
import timeit
def foo():
    ...

print(timeit.timeit('foo()', 'from __main__ import foo', number=100))
It fixes most of the issues I mention by selecting the best timing function for the OS you're on (and also fixes other sources of jitter, e.g. cyclic garbage collection, which is disabled during the test and reenabled at the end).
Even if you don't want to use that for some reason, if you're using Python 3.3 or higher, take a look at the replacements for time.clock, e.g. time.perf_counter (includes time spent sleeping) or time.process_time (includes only CPU time), both of which are portable, reliable, fast, and high resolution for better accuracy.
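For instance, a minimal sketch showing the difference around a sleep (exact numbers will vary by system):

import time

start_wall = time.perf_counter()
start_cpu = time.process_time()
time.sleep(1)                              # sleeping costs wall-clock time, not CPU time
print(time.perf_counter() - start_wall)    # roughly 1.0
print(time.process_time() - start_cpu)     # close to 0.0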
time.sleep() can be terminated early by any signal; read about it here:
http://www.tutorialspoint.com/python/time_sleep.htm
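If early wake-ups matter for your code, one common workaround (a sketch; on Python 3.5+ sleep() is automatically restarted after most signals) is to sleep against an absolute deadline in a loop:

import time

def sleep_until(deadline):
    # keep sleeping until the deadline has passed, even if an earlier
    # sleep() call returns early
    remaining = deadline - time.time()
    while remaining > 0:
        time.sleep(remaining)
        remaining = deadline - time.time()

sleep_until(time.time() + 0.5)  # sleep for half a second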
I want to generate a square clock waveform for an external device.
I use python 2.7 with Windows 7 32bit on an old PC with a LPT1 port.
The code is simple:
import parallel
import time
p = parallel.Parallel()  # open LPT1
x = 0
while (x == 0):
    p.setData(0xFF)
    time.sleep(0.0005)
    p.setData(0x00)
I do see the square wave (using a scope), but not with the expected time period.
I will be grateful for any help.
It gives the expected performance for a while... Continuing to reduce the times:
import parallel
import time
x = 0
while (x < 2000):
    p = parallel.Parallel()  # open LPT1
    time.sleep(0.01)
    p.setData(0xFF)
    p = parallel.Parallel()  # open LPT1
    time.sleep(0.01)
    p.setData(0x00)
    x = x + 1
Generating signals like that is hard. One reason is that the process gets interrupted, and control only returns once the sleep time has been exceeded.
Found this post about sleep precision with an accepted answer that is great:
How accurate is python's time.sleep()?
another source of information: http://www.pythoncentral.io/pythons-time-sleep-pause-wait-sleep-stop-your-code/
What that information tells you is that Windows can sleep for a minimum of roughly 10 ms, while on Linux the minimum is approximately 1 ms, but it may vary.
Update
I made a function that makes it possible to sleep for less than 10 ms, but the precision is very sketchy.
In the attached code I included a test that shows how the precision behaves. If you want higher precision, I strongly recommend you read the links I attached in my original answer.
from time import time, sleep
import timeit
def timer_sleep(duration):
    """ timer_sleep() sleeps for a given duration in seconds """
    stop_time = time() + duration
    while (time() - stop_time) < 0:
        # throw in something that will take a little time to process.
        # According to measurements from the comments, it will take approx
        # 2 microseconds to handle this one.
        sleep(0)

if __name__ == "__main__":
    for u_time in range(1, 100):
        u_constant = 1000000.0
        duration = u_time / u_constant
        result = timeit.timeit(stmt='timer_sleep({time})'.format(time=duration),
                               setup="from __main__ import timer_sleep",
                               number=1)
        print('===== RUN # {nr} ====='.format(nr=u_time))
        print('Returns after \t{time:.10f} seconds'.format(time=result))
        print('It should take\t{time:.10f} seconds'.format(time=duration))
Happy hacking
import time
word = {"success":0, "desire":0, "effort":0, ...}
def cleaner(x):
    dust = ",./<>?;''[]{}\=+_)(*&^%$##!`~"
    for letter in x:
        if letter in dust:
            x = x[0:x.index(letter)]+x[x.index(letter)+1:]
        else:
            pass
    return x #alhamdlillah it worked 31.07.12

print "input text to analyze"
itext = cleaner(raw_input()).split()

t = time.clock()
for iword in itext:
    if iword in word:
        word[iword] += 1
    else:
        pass
print t
print len(itext)
Every time I call the code, t will increase. Can anyone explain the underlying concept/reason behind this, perhaps in terms of system processes? Thank you very much, programming lads.
Because you're printing out the current time each time you run the script.
That's how time works: it advances, constantly.
If you want to measure the time taken for your for loop (between the first call to time.clock() and the end), print out the difference in times:
print time.clock() - t
You are printing the current time... of course it increases every time you run the code.
From the python documentation for time.clock():
On Unix, return the current processor time as a floating point number
expressed in seconds. The precision, and in fact the very definition
of the meaning of “processor time”, depends on that of the C function
of the same name, but in any case, this is the function to use for
benchmarking Python or timing algorithms.
On Windows, this function returns wall-clock seconds elapsed since the
first call to this function, as a floating point number, based on the
Win32 function QueryPerformanceCounter(). The resolution is typically
better than one microsecond.
time.clock() returns the elapsed CPU time since the process was created. CPU time is based on how many cycles the CPU spent in the context of the process. It is a monotonic function during the lifetime of a process, i.e. if you call time.clock() several times in the same execution, you will get a list of increasing numbers. The difference between two successive invocations of clock() could be less than the elapsed wall-clock time or more, depending on whether the CPU was not running at 100% (e.g. there was some waiting for I/O) or whether you have a multithreaded program which consumes more than 100% of CPU time (e.g. a multicore CPU with 2 threads using 75% each -> you'd get 150% of the wall-clock time). But if you call clock() once in one process, then rerun the program again, you might get a lower value than the one before, if it takes less time to process the input in the new process.
What you should be doing instead is to use time.time() which returns the current Unix timestamp with fractional (subsecond) precision. You should call it once before the processing is started and once after that and subtract the two values in order to compute the wall-clock time elapsed between the two invocations.
Note that on Windows time.clock() returns the elapsed wall-clock time since the process was started. It is like calling time.time() immediately at the beginning of the script and then subtracting the value from later calls to time.time().
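A minimal sketch of that pattern (with a small stand-in for itext and word):

import time

word = {"success": 0, "desire": 0, "effort": 0}
itext = "success takes effort and desire".split()

t0 = time.time()
for iword in itext:
    if iword in word:
        word[iword] += 1
print(time.time() - t0)  # wall-clock seconds spent in the loop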
There is a really good library called jackedCodeTimerPy that works better than the time module. It also has some clever error checking so you may want to try it out.
Using jackedCodeTimerPy your code should look like this:
# import time
from jackedCodeTimerPY import JackedTiming
JTimer = JackedTiming()
word = {"success":0, "desire":0, "effort":0}
def cleaner(x):
    dust = ",./<>?;''[]{}\=+_)(*&^%$##!`~"
    for letter in x:
        if letter in dust:
            x = x[0:x.index(letter)]+x[x.index(letter)+1:]
        else:
            pass
    return x #alhamdlillah it worked 31.07.12
print "input text to analyze"
itext = cleaner(raw_input()).split()
# t = time.clock()
JTimer.start('timer_1')
for iword in itext:
    if iword in word:
        word[iword] += 1
    else:
        pass
# print t
JTimer.stop('timer_1')
print JTimer.report()
print len(itext)
It gives really good reports like
label min max mean total run count
------- ----------- ----------- ----------- ----------- -----------
imports 0.00283813 0.00283813 0.00283813 0.00283813 1
loop 5.96046e-06 1.50204e-05 6.71864e-06 0.000335932 50
I like how it gives you statistics on it and the number of times the timer is run.
It's simple to use. If I want to measure the time code takes in a for loop, I just do the following:
from jackedCodeTimerPY import JackedTiming
JTimer = JackedTiming()
for i in range(50):
    JTimer.start('loop') # 'loop' is the name of the timer
    doSomethingHere = 'This is really useful!'
    JTimer.stop('loop')
print(JTimer.report()) # prints the timing report
You can also have multiple timers running at the same time.
JTimer.start('first timer')
JTimer.start('second timer')
do_something = 'amazing'
JTimer.stop('first timer')
do_something = 'else'
JTimer.stop('second timer')
print(JTimer.report()) # prints the timing report
There are more usage examples in the repo. Hope this helps.
https://github.com/BebeSparkelSparkel/jackedCodeTimerPY