Why is the execution time of functions not constant?


I studied the order of growth of functions in theory in my university class and tried to implement it in practice at home. The order of growth turned out to be exactly the same as in the textbooks, but the execution time changes every single time I run the program. Why is that?
Source Code
import time
import math
from tabulate import tabulate
n=eval(input("Enter the value of n: "));
t1=time.time()
a=12
t2=time.time()
A=t2-t1
t3=time.time()
b=n
t4=time.time()
B=t4-t3
t5=time.time()
c=math.log10(n);
t6=time.time()
C=t6-t5
t7=time.time()
d=n*math.log10(n);
t8=time.time()
D=t8-t7
t9=time.time()
e=n**2
t10=time.time()
E=t10-t9
t11=time.time()
f=2**n
t12=time.time()
F=t12-t11
print(tabulate([['constant',a,A], ['n',b,B], ['logn',c,C], ['nlogn',d,D], ['n**2',e,E], ['2**n',f,F]], headers=['Function', 'Value', 'Time']))
templist= [A,B,C,D,E,F]
print("The time order in acsending order is: ", sorted(templist,key=int))
First Execution
naufil#naufil-Inspiron-7559:~/Desktop/python$ python3 time_order.py
Enter the value of n: 100
Function              Value         Time
----------  ---------------  -----------
constant                 12  2.14577e-06
n                       100  1.43051e-06
logn                      2   4.1008e-05
nlogn                   200  3.57628e-06
n**2                  10000  3.33786e-06
2**n            1.26765e+30   3.8147e-06
The time order in acsending order is: [2.1457672119140625e-06, 1.430511474609375e-06, 4.100799560546875e-05, 3.5762786865234375e-06, 3.337860107421875e-06, 3.814697265625e-06]
Second Execution
naufil#naufil-Inspiron-7559:~/Desktop/python$ python3 time_order.py
Enter the value of n: 100
Function              Value         Time
----------  ---------------  -----------
constant                 12  2.14577e-06
n                       100  1.19209e-06
logn                      2  4.64916e-05
nlogn                   200  4.05312e-06
n**2                  10000  3.33786e-06
2**n            1.26765e+30  3.57628e-06
The time order in acsending order is: [2.1457672119140625e-06, 1.1920928955078125e-06, 4.649162292480469e-05, 4.0531158447265625e-06, 3.337860107421875e-06, 3.5762786865234375e-06]

As other comments and answers have rightly pointed out, the differences in execution time that you observe come from the way operating systems work. Rigorous measurement is a complicated matter, though, so let me elaborate a bit and point you to where you might direct your experimentation.
What your OS does behind your back
You can think of the OS as a conductor and programs as instrument players, and imagine that only so many instruments can play at the same time. The conductor must therefore choose at each moment who should play, while also making sure nobody is left frustrated in the end! Likewise, the OS is constantly in charge of choosing which programs to execute, that is, which programs get CPU time. The number of programs (or rather processes) that can execute at the same time is usually limited by the number of cores in your processor.
In practice, the way the OS chooses what to execute is a very complex and fascinating subject, which relies on experimentation-backed heuristics. (Read more here). What you have to understand is that there is hardly any way for you to alter this behavior, and none to guarantee the same execution time between two runs.
Using Linux's time command
Calling Python's time.time() the way you do measures the wall-clock time elapsed between the two calls, so, because of the above, you are not only measuring the time spent executing your program. If you want a better sense of how much time the OS actually dedicated to your program, you can use the Linux command time. The user time it reports is the CPU time your program spent executing in user mode. Check out this thread for more info. But understand that this time is subject to fluctuations as well!
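The same distinction can be observed from inside Python. The following is only a minimal sketch, not part of the original answer: it assumes Python 3.3+ and uses a made-up `busy_work` function to contrast wall-clock time (`time.perf_counter`) with the CPU time of the current process (`time.process_time`).

```python
import time

def busy_work(n):
    """Hypothetical CPU-bound work; stands in for any computation."""
    return sum(i * i for i in range(n))

wall_start = time.perf_counter()   # wall-clock timer
cpu_start = time.process_time()    # CPU time (user + system) of this process

busy_work(200_000)
time.sleep(0.2)                    # sleeping costs wall time but almost no CPU time

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"wall-clock: {wall_elapsed:.4f} s")
print(f"CPU time:   {cpu_elapsed:.4f} s")
```

The wall-clock figure includes the sleep, while the CPU figure should stay close to the time actually spent computing. Both still vary somewhat from run to run.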
What wisdom are you trying to draw from your measurements?
Finally, you should ask yourself whether the exact time is really what you want. Do you care about the value, or do you want to exhibit behaviors?
The usual way to measure performance is to average the execution times of repeated calls. This way, the effects of the OS's own business should average out. (You can see that as building an unbiased estimator for a random process.) From what I understand, you are trying to show the difference in execution times for algorithms with different complexity. So the actual execution time is not so relevant; what matters is the relative order. Averaging over multiple calls reduces the variance of the observation, so you will be able to make stronger statements about the relative execution times.
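One way to do this averaging is the standard library's `timeit` module. The sketch below is an illustration under assumptions: the `candidates` dictionary and the repetition counts are made up for the example, and the expressions are the ones from the question with n fixed at 100.

```python
import math
import statistics
import timeit

n = 100  # same input as in the question

# Hypothetical candidates: each lambda wraps one expression from the question.
candidates = {
    "constant": lambda: 12,
    "logn": lambda: math.log10(n),
    "nlogn": lambda: n * math.log10(n),
    "n**2": lambda: n ** 2,
    "2**n": lambda: 2 ** n,
}

calls_per_run = 10_000
results = {}
for name, func in candidates.items():
    # Five independent runs, each timing 10,000 calls of the expression.
    runs = timeit.repeat(func, number=calls_per_run, repeat=5)
    per_call = [t / calls_per_run for t in runs]
    results[name] = (statistics.mean(per_call), min(per_call))

for name, (mean, best) in results.items():
    print(f"{name:10s} mean {mean:.3e} s   best {best:.3e} s")
```

The mean smooths out scheduling noise, and the minimum is often quoted as well because it is the run least disturbed by other processes.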

You should address this question to your operating system. What else runs on your computer? List the various processes and see how many there are; all it takes is another process being scheduled, or even a single context switch, to alter your execution time. Among other things, calling time.time can trigger such a switch, since it may involve a call into the operating system.
It also depends on what system support routines are already loaded when you call them -- many of those calls being implicit or secondary. If you need to allocate more memory for a particular instruction because another process took the last of your RAM and then swapped out ... well, you get the idea, I hope.

Related

CPLEX: How to get the real running time in a deterministic way? (Python)

I solved a MIP problem, making the solving process sleep for 1 second at every branch using BranchCallback (single thread). I noticed from the log that the system time, measured in seconds, changed on every run, while the deterministic time, measured in ticks, did not. However, the problem was that the latter did not change regardless of whether the 1-second sleep was applied. The system time, by contrast, did record the sleep time.
I also tried to get the deterministic time using the callback API, but it counted only 0.0 ticks for the 1-second sleep. The sleep itself is not the issue, because a simple piece of code counting up to a large number also showed 0.0 ticks. I suspect it does not record the running time of my code.
What exactly does the deterministic time measure in CPLEX? Is there any way to measure the real running time (especially the real callback running time) as the system time does, but in a deterministic way?

The deterministic time is an approximation for the work that CPLEX does (you can think of it as number of instructions executed inside CPLEX). Doing nothing does not execute any instructions so does not count towards deterministic time.
Moreover, deterministic time is only measured inside CPLEX. It does not account for time spent in user code like callbacks.
If you want to measure the time spent in your callback then you have to do that yourself (there is no point for CPLEX to track this): just take a time stamp at the beginning of your callback, one at the end of your callback and then compute the difference. The CPLEX callbacks have functions to take time stamps, see the reference documentation.
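The "time stamp at the beginning, time stamp at the end" idea can be sketched in plain Python. This sketch deliberately uses no CPLEX API: `CallbackTimer` and `my_branch_logic` are hypothetical names, and `time.perf_counter` stands in for whichever time stamp function the callback actually provides.

```python
import time

class CallbackTimer:
    """Accumulates the wall-clock time spent inside a wrapped callable."""

    def __init__(self, func):
        self.func = func
        self.total = 0.0
        self.calls = 0

    def __call__(self, *args, **kwargs):
        start = time.perf_counter()          # time stamp at the beginning
        try:
            return self.func(*args, **kwargs)
        finally:
            # time stamp at the end, taken even if the callable raises
            self.total += time.perf_counter() - start
            self.calls += 1

def my_branch_logic(x):
    """Hypothetical user code that would run inside a callback."""
    time.sleep(0.05)
    return x + 1

timed = CallbackTimer(my_branch_logic)
outputs = [timed(i) for i in range(3)]
print(f"{timed.calls} calls, {timed.total:.3f} s in user code")
```

A measurement like this is wall-clock time, so it records the sleep but is not deterministic across runs.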
If you want a deterministic time for code you wrote, then you have to roll your own, and first of all define what deterministic time means for your code.

Highly variable execution times in Cython functions

I have a performance measurement issue while migrating C-compiled functions (through scipy.weave), called from a Python engine, to Cython.
The new Cython functions, profiled end-to-end with cProfile (I would rather not go deep into Cython profiling unless necessary), show highly variable cumulative times.
E.g. the cumulative time of a Cython function executed 9 times per 5 repetitions (after a warm-up of 5 executions, which the profiling function does not take into account) is:
in a first round, 215.627339 seconds
in a second round, 235.336131 seconds
Each execution calls the functions many times with different, but fixed parameters.
Maybe this variability depends on the CPU load of the test machine (a cloud-hosted dedicated one), but I wonder whether such variability (almost 10%) could be due in some way to Cython or to a lack of optimization (I already use hints on division, bounds checking, wrap-around, ...).
Any ideas on how to take reliable metrics?

First of all, you need to ensure that your measurement tool is capable of measuring what you need: specifically, only the system resources you consume. The user time (utime) reported by UNIX tools such as the time command is one such measure, although even that one still includes swap time. Check the documentation of your profiler: it should have capabilities to measure only the CPU time consumed by the function. If so, then your figures are due to something else.
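With cProfile specifically, one possible way to count only CPU time is to hand the profiler a CPU-time clock as its timer. This is a sketch under assumptions: `work` is a made-up stand-in for the profiled function, and whether calls into compiled Cython code show up individually depends on how the extension was built.

```python
import cProfile
import pstats
import time

def work(n):
    """Hypothetical CPU-bound function standing in for the profiled code."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Use the process CPU-time clock as the profiler's timer, so that time spent
# sleeping or waiting while other processes run is not counted.
profiler = cProfile.Profile(timer=time.process_time)
result = profiler.runcall(work, 200_000)

stats = pstats.Stats(profiler)
stats.sort_stats("cumulative").print_stats(5)
```

If the figures still vary by about 10% with a CPU-time clock, the variation is less likely to come from other processes competing for the machine.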
Once you've controlled the external variations, you need to examine the internal. You've said nothing about the complexion of your function. Some (many?) functions have available short-cuts for data-driven trivialities, such as multiplication by 0 or 1. Some are dependent on an overt or covert iteration that varies with the data. You need to analyze the input data with respect to the algorithm.
One tool you can use is a line-oriented profiler to detail where the variations originate; seeing which lines take the extra time should help determine where the "noise" comes from.

I'm not a performance expert, but from my understanding, shouldn't you be measuring the average time per execution rather than the cumulative time? Other than that, is your function doing anything like reading from disk and/or making network requests?

Wasting cpu cycles with python

I am trying to create a simple app that wastes CPU cycles for multi-core research. The one I created takes up 100% of a core. I want it to be around 30%, 60%, or 70%. Which adjustments should I make in order to achieve this? Thanks in advance.
Current version:
a = 999999999
while True:
    a = a / 2

Starting at a large number isn't necessary, as dividing a number by 2 will quickly end up as 0/2 over and over again anyway. Besides, you don't have to actually do anything in a loop to consume CPU cycles - the mere action of looping is enough. This is why any infinite loop, even something as simple as while 1: pass, will eat up an entire CPU core until killed. To avoid taking up an entire core, use time.sleep to pause execution of the thread for a certain period of time. This function takes a single argument representing the time in seconds for the thread to sleep. It accepts a floating-point number.
import time
while 1:
    time.sleep(0.0001)
Simply run an instance of this script (with an appropriate sleep time for the workload you'd like to put on your particular system) for each core you'd like to test.
Note that some operating systems may not support sleep times of less than one millisecond, causing shorter sleep times to come through as zero, making them incompatible with this strategy. See Python: high precision time.sleep and How accurate is python's time.sleep()? for more.
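To aim at a particular percentage such as 30%, one common variation is a duty cycle: spin for a fraction of each short period and sleep for the rest. This is a rough sketch and not the method described above; the function `burn`, its parameters, and the 0.1-second period are assumptions chosen for illustration, and the load actually achieved depends on the OS scheduler and timer resolution.

```python
import time

def burn(load, duration, period=0.1):
    """Keep one core busy for roughly `load` (0.0 to 1.0) of each period."""
    busy = period * load
    end = time.perf_counter() + duration
    while time.perf_counter() < end:
        cycle_start = time.perf_counter()
        # Busy phase: spin until this period's busy share is used up.
        while time.perf_counter() - cycle_start < busy:
            pass
        # Idle phase: sleep for whatever remains of the period.
        remaining = period - (time.perf_counter() - cycle_start)
        if remaining > 0:
            time.sleep(remaining)

cpu_before = time.process_time()
wall_before = time.perf_counter()
burn(load=0.3, duration=1.0)
cpu_used = time.process_time() - cpu_before
wall_used = time.perf_counter() - wall_before
print(f"approximate load: {cpu_used / wall_used:.0%}")
```

As with the sleep-only script, one instance would be started per core to be loaded.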

How to keep track of execution time?

The Setup
I'm working on training some neural networks. These have lots of hyperparameters, and typically you see how each set of hyperparameters performs, then pick your favorite. This is often done by (say) training a network with the given parameters for n epochs, then evaluating its performance, yielding a numerical score for each set of parameters and allowing you to pick the best.
There's a problem with this, though. Some sets of parameters let you go through more epochs more quickly, but benefit less from each epoch. Additionally, pretty much any set of parameters will always do better, given more epochs, so given infinite time, they would all do really well (to a point, but that's not the point right now).
The Problem
What I would prefer to do is to let each process figure out how long it's been running, and cut itself off (gracefully) after a specified number of seconds. The problem is, I would like to multithread this, so just because the program has been running for 60 seconds doesn't mean the process has had 60 seconds of fair CPU time.
So how can I measure how much time the process has actually had available to it, within the process itself?
The time.clock() method gives system time, which is problematic (as above).
The timeit module seems a bit better, but it's external to the script, so the process wouldn't know when to stop.
Is there a better way? Am I wrong about one of the above ways?
Specific Question
How can a python process see how many seconds it has been allocated so far? Not the amount of time that has passed, but how many seconds it itself has been allowed to execute for?

Use os.times().
This gives you the user and system times for the current process. Below is an example limiting the amount of user time.
import os

start = os.times()
limit = 5  # seconds of user time
while True:
    # your code here
    check = os.times()
    if check.user - start.user > limit:
        break
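A closely related option on Python 3.3+ is `time.process_time()`, which returns the sum of user and system CPU time of the current process as a single float. The sketch below assumes a hypothetical `train_one_step` function and a small 0.2-second budget so that it finishes quickly.

```python
import time

def train_one_step(state):
    """Hypothetical unit of work standing in for one training step."""
    return state + sum(i * i for i in range(10_000))

cpu_limit = 0.2  # seconds of CPU time (user + system) allowed
start = time.process_time()
state = 0
steps = 0
while time.process_time() - start < cpu_limit:
    state = train_one_step(state)
    steps += 1

cpu_used = time.process_time() - start
print(f"stopped after {steps} steps and {cpu_used:.3f} s of CPU time")
```

Note that `time.process_time()` counts all threads of the process together. If each worker is a thread rather than a separate process, `time.thread_time()` (Python 3.7+) may be the closer fit.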

Compute random number over certain time interval with Python

I did some research before posting but seem to be at a loss (I'm not very experienced in coding).
I am attempting to generate or compute a random number for a certain time interval with Python. I'm not looking for full code; I want help using the time library, if that is the correct one to use.
Pseudo-code:
Allow python [PC] to compute a random number for 3 seconds
------> Store the computed generation in a value (i can handle this)
I would then use the randomly generated value to index into a Python list (which would also be generated automatically via random number generation, but I can figure that out).

I'm not sure why you want to do this, but here's how to compute many random numbers, throwing most of them away, and then using the last one after 3 seconds have elapsed.
import random
import time
start = time.clock()
while time.clock() - start < 3:
    random_number = random.randint(0,100)
print random_number
This pointlessly throws away about 2 million perfectly good random numbers on my machine.
(And, as abarnert points out, this also maxes out one CPU core for the whole 3 seconds in a busy loop, which is very, very wasteful, but I think it's what you were asking for?)
EDIT: Updated to use time.clock instead of time.time, as suggested by abarnert again (thanks), because this seems to give better resolution across platforms and doesn't suffer from problems when the system time is altered in the middle of the program running.
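On current Python 3 the snippet above will not run as written, because `print` is a function and `time.clock()` no longer exists. A possible Python 3 equivalent, offered as a sketch rather than as the answerer's own code, uses `time.monotonic()` and a deadline. The `duration` and `count` names are additions for illustration, and the duration is shortened from the question's 3 seconds.

```python
import random
import time

duration = 0.5  # seconds; the question asks for 3
deadline = time.monotonic() + duration

count = 0
random_number = None
while time.monotonic() < deadline:
    random_number = random.randint(0, 100)  # each value overwrites the previous one
    count += 1

print(random_number)
print(f"discarded {count - 1} values along the way")
```

Like the original, this is a busy loop that keeps one core fully occupied until the deadline passes.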

First, you didn't say what kind of random number you want to generate, but given that your example is 10, I assume it's an integer in some range—let's say you're calling random.randrange(30).
Now, you want to compute a number every second for 3 seconds, then keep the last one. I don't know why you'd even want to do this, but you can do it like this:
import random
import time

for i in range(3):
    number = random.randrange(30)
    time.sleep(1.0)
At the end of 3 seconds, number will be the third random number generated.
The key here is that, to do something once per second (in a synchronous program—don't do this in a GUI or server!)—you just call time.sleep.
If the operation you were doing took a significant chunk of a second (or longer), this wouldn't be appropriate. Instead, you'd want to compute the start time, and sleep until a second after that:
t0 = time.monotonic()
for i in range(3):
    number = random.randrange(30)
    t0 += 1
    # clamp at zero: time.sleep() rejects negative values if we are running late
    time.sleep(max(0.0, t0 - time.monotonic()))
Note that I've used time.monotonic here. This function is specifically designed for this kind of use case. It returns as much precision as can be gotten with reasonable efficiency (in particular, unlike time.time, it doesn't give you 1s precision on some platforms), and it guarantees that you'll never go backward even if, e.g., you change the system clock in the middle of the program. If you're using 3.2 or earlier, either look through the docs for the best alternative (possibly using time.clock()), or look into using ctypes to call the appropriate platform native function.
But in this case, random.randrange is going to take somewhere on the order of a microsecond, which is so much less time than the minimum resolution of most systems' simple timers that there's no reason to do such a thing.

If you want to take 3 seconds to get a random number because you're concerned about the quality of the random number, you can use os.urandom() to generate the value. If all you really want to do is select an item from your list at random, you can use random.choice().
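Both suggestions fit in a few lines. In this sketch the `items` list is a made-up placeholder for the list from the question, and 4 bytes is an arbitrary choice of size.

```python
import os
import random

items = ["red", "green", "blue"]  # hypothetical list standing in for yours

# Pick one element uniformly at random; no waiting is needed.
picked = random.choice(items)

# Draw 4 bytes from the operating system's randomness source
# and interpret them as an unsigned integer.
value = int.from_bytes(os.urandom(4), "big")

print(picked, value)
```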

Note: The function time.clock() was removed in Python 3.8, after having been deprecated since Python 3.3. Use time.perf_counter() or time.process_time() instead, depending on your requirements, to get well-defined behavior. (Contributed by Matthias Bussonnier in bpo-36895.)
