I did some research before posting but seem to be at a loss (not too experienced in coding).
I am attempting to generate or compute a random number for a certain time interval with Python. I'm not looking for full code; I want help using the time library, if that is the correct one to use.
Pseudo-code:
Allow Python [PC] to compute a random number for 3 seconds
------> Store the computed generation in a value (I can handle this)
I would then use the randomly generated value to index into a Python list (which would itself be generated via random number generation as well, but I can figure that out).
I'm not sure why you want to do this, but here's how to compute many random numbers, throwing most of them away, and then using the last one after 3 seconds have elapsed.
import random
import time

start = time.clock()
while time.clock() - start < 3:
    random_number = random.randint(0, 100)
print(random_number)
This pointlessly throws away about 2 million perfectly good random numbers on my machine.
(And, as abarnert points out, this also maxes out one CPU core for the whole 3 seconds in a busy loop, which is very, very wasteful, but I think it's what you were asking for?)
EDIT: Updated to use time.clock instead of time.time, as suggested by abarnert again (thanks), because this seems to give better resolution across platforms and doesn't suffer from problems when the system time is altered in the middle of the program running.
First, you didn't say what kind of random number you want to generate, but given that your example is 10, I assume it's an integer in some range—let's say you're calling random.randrange(30).
Now, you want to compute a number every second for 3 seconds, then keep the last one. I don't know why you'd even want to do this, but you can do it like this:
for i in range(3):
    number = random.randrange(30)
    time.sleep(1.0)
At the end of 3 seconds, number will be the third random number generated.
The key here is that, to do something once per second in a synchronous program (don't do this in a GUI or server!), you just call time.sleep.
If the operation you were doing took a significant chunk of a second (or longer), this wouldn't be appropriate. Instead, you'd want to compute the start time, and sleep until a second after that:
t0 = time.monotonic()
for i in range(3):
    number = random.randrange(30)
    t0 += 1
    time.sleep(t0 - time.monotonic())
Note that I've used time.monotonic here. This function is specifically designed for this kind of use case. It returns as much precision as can be gotten with reasonable efficiency (in particular, unlike time.time, it doesn't give you 1s precision on some platforms), and it guarantees that you'll never go backward even if, e.g., you change the system clock in the middle of the program. If you're using 3.2 or earlier, either look through the docs for the best alternative (possibly using time.clock()), or look into using ctypes to call the appropriate platform native function.
But in this case, random.randrange is going to take somewhere on the order of a microsecond, which is so much less time than the minimum resolution of most systems' simple timers that there's no reason to do such a thing.
If you want to take 3 seconds to get a random number because you're concerned about the quality of the randomness, you can use os.urandom() to generate the value. If all you really want to do is select an item from your list at random, you can use random.choice().
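For example, a minimal sketch of both options (the list contents here are just placeholders, and note that the modulo step introduces a slight bias unless the list length divides 2**32 evenly):

import os
import random

items = [4, 8, 15, 16, 23, 42]  # placeholder list

# Simple uniform selection from a list:
print(random.choice(items))

# Randomness drawn from the OS entropy pool via os.urandom:
index = int.from_bytes(os.urandom(4), "big") % len(items)
print(items[index])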
Note: The function time.clock() has been removed, after having been deprecated since Python 3.3: use time.perf_counter() or time.process_time() instead, depending on your requirements, to have well-defined behavior. (Contributed by Matthias Bussonnier in bpo-36895.)
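For reference, here is the busy loop from the first answer rewritten for modern Python, using time.perf_counter:

import random
import time

start = time.perf_counter()
while time.perf_counter() - start < 3:
    random_number = random.randint(0, 100)
print(random_number)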
Related
I thought I'd sign up here to help me learn more Python coding. I recently signed up to Codecademy and have been doing the online Python course, which has helped a lot.
I've decided to give myself a small project to continue learning, but have run into a problem (I searched on here and still found no help).
I want to write a small function for a MIDI step sequencer; for simplicity I'm omitting MIDI for now and looking at it in the most logical way I can.
What I want to do is:
input a set of MIDI note numbers
append these to a list
loop through this list at a timed interval (BPM): for example, 60,000 ms / 120 BPM = 500 ms between quarter notes, and 500 ms / 24 PPQN = 20.8333 ms per pulse.
The trouble I have is that I can't find any way to iterate through a list in the time domain. I have looked at the time.sleep function but read that it is not accurate enough. Is there another method? I don't want to use any external libraries.
Any pointers would be a huge help as I'm struggling to find any available resources for running through a loop at a specified amount of time between each value in the loop.
Could you say why sleep is not accurate enough?
If you wish, you can keep track of the elapsed time yourself using something like time.thread_time_ns,
so:
import time

def sleep(pause_time):
    # pause_time is in nanoseconds; thread_time_ns counts this thread's CPU time,
    # so time spent descheduled by the OS is not counted
    initial_time = time.thread_time_ns()
    while time.thread_time_ns() - initial_time < pause_time:
        pass
So this is your own busy-waiting sleep function.
The reason time.sleep might not be accurate for you might be due to the way you are using it. Try this:
import time

sleeptime = 0.0208333  # in seconds

while True:  # your loop here
    start = time.time()
    # do stuff here
    time.sleep(max(sleeptime - (time.time() - start), 0))
I have used this method to limit frame rate in computer vision processing. All it does is account the loop iteration time in the sleep so that the time for each loop is as accurate as possible. It might work for you too. Hope this helps!
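If the long-run timing needs to stay locked to the pulse grid (which matters for a sequencer), you can extend this idea by scheduling against absolute deadlines, as in the time.monotonic example earlier on this page. A sketch, where play(note) is a hypothetical stand-in for whatever you do per pulse:

import time

PULSE = 60.0 / 120 / 24  # 120 BPM, 24 PPQN -> about 20.833 ms per pulse

def run_sequencer(notes):
    deadline = time.monotonic()
    for note in notes:
        play(note)  # hypothetical: send the MIDI note, print it, etc.
        deadline += PULSE
        # sleep until the next absolute deadline, so errors don't accumulate
        time.sleep(max(deadline - time.monotonic(), 0))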
Well of course there's a much more accurate way to do this, which would be to write all your code in assembly and finely adjust the clock speed of your CPU so that each iteration takes a fixed amount of time, but this might be too impractical for your use case.
In my university class I studied the order of growth of functions theoretically, and I tried implementing it practically at home. Although the order of growth turned out to be exactly the same as in the textbooks, the execution times change every single time I execute the program. Why is that?
Source Code
import time
import math
from tabulate import tabulate

n = int(input("Enter the value of n: "))

t1 = time.time()
a = 12
t2 = time.time()
A = t2 - t1

t3 = time.time()
b = n
t4 = time.time()
B = t4 - t3

t5 = time.time()
c = math.log10(n)
t6 = time.time()
C = t6 - t5

t7 = time.time()
d = n * math.log10(n)
t8 = time.time()
D = t8 - t7

t9 = time.time()
e = n**2
t10 = time.time()
E = t10 - t9

t11 = time.time()
f = 2**n
t12 = time.time()
F = t12 - t11

print(tabulate([['constant', a, A], ['n', b, B], ['logn', c, C], ['nlogn', d, D], ['n**2', e, E], ['2**n', f, F]], headers=['Function', 'Value', 'Time']))
templist = [A, B, C, D, E, F]
# note: key=int truncates these tiny floats to 0, so this doesn't actually sort them
print("The time order in ascending order is: ", sorted(templist, key=int))
First Execution
naufil@naufil-Inspiron-7559:~/Desktop/python$ python3 time_order.py
Enter the value of n: 100
Function Value Time
---------- --------------- -----------
constant 12 2.14577e-06
n 100 1.43051e-06
logn 2 4.1008e-05
nlogn 200 3.57628e-06
n**2 10000 3.33786e-06
2**n 1.26765e+30 3.8147e-06
The time order in ascending order is: [2.1457672119140625e-06, 1.430511474609375e-06, 4.100799560546875e-05, 3.5762786865234375e-06, 3.337860107421875e-06, 3.814697265625e-06]
Second Execution
naufil@naufil-Inspiron-7559:~/Desktop/python$ python3 time_order.py
Enter the value of n: 100
Function Value Time
---------- --------------- -----------
constant 12 2.14577e-06
n 100 1.19209e-06
logn 2 4.64916e-05
nlogn 200 4.05312e-06
n**2 10000 3.33786e-06
2**n 1.26765e+30 3.57628e-06
The time order in ascending order is: [2.1457672119140625e-06, 1.1920928955078125e-06, 4.649162292480469e-05, 4.0531158447265625e-06, 3.337860107421875e-06, 3.5762786865234375e-06]
As other comments and answers have rightly pointed out, the reason for the difference in execution times that you observe comes from the way operating systems work. But doing rigorous measurements is a complicated matter, so let me elaborate a bit and give you pointers to where you should direct your experimentation.
What your OS does behind your back
You can see the OS as a conductor and programs as instrument players, and imagine there are only so many instruments that can play at the same time. The conductor must therefore choose at each moment who should play, while making sure nobody is frustrated in the end! Likewise, the OS is constantly in charge of choosing which programs to execute, meaning which programs to dedicate CPU time to. The number of programs (or rather processes) that can execute at the same time is usually limited by the number of cores in your processor.
In practice, the way that OS chooses what to execute is a very complex and fascinating subject, which relies on experimentation-backed heuristics. (Read more here). What you have to understand, is that there is hardly any way for you to alter this behavior, and none to guarantee the same execution time between two calls.
Using Linux's time command
Calling Python's time like you do measures the physical time elapsed between two calls, so because of what we have said, you don't only measure time spent on your program's execution. If you want a better sense of what time the OS actually dedicated to your program, you can use the Linux command time. The user time will give you the actual CPU time dedicated to the execution of your program. Check out this thread for more info. But understand that this time is subject to oscillations as well!
What wisdom are you trying to draw from your measurements?
Finally, you should ask yourself whether the exact time is really what you want. Do you care about the value, or do you want to exhibit behaviors?
Usually what is done to measure performance is averaging the execution times of repeated calls. This way, the effects that pertain to the OS's business should be averaged out. (You can see that as building an unbiased estimator for a random process.) From what I understand, you are trying to show the difference in execution times for algorithms with different complexity, so the actual execution time is not so relevant; what is, is the relative order. Averaging multiple calls will reduce the variance of the observation, and you will be able to make stronger statements as to the relative execution times.
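For example, Python's standard timeit module does this repetition for you; here is a minimal sketch timing the n*log10(n) expression from the question:

import math
import timeit

n = 100
reps = 1_000_000
total = timeit.timeit(lambda: n * math.log10(n), number=reps)
print(total / reps, "seconds per call on average")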
You should address this question to your operating system. What else runs on your computer? List the various processes and see how many there are; all it takes is a process or even a context switch to alter your execution time. Among other things, calling time.time can invoke such a switch, as it is a call to a system process.
It also depends on what system support routines are already loaded when you call them -- many of those calls being implicit or secondary. If you need to allocate more memory for a particular instruction because another process took the last of your RAM and then swapped out ... well, you get the idea, I hope.
Each sample is an array of features (ints). I need to split my samples into two separate groups by figuring out what the best feature, and the best splitting value for that feature, is. By "best", I mean the split that gives me the greatest entropy difference between the pre-split set and the weighted average of the entropy values on the left and right sides. I need to try all (2^m − 2)/2 possible ways to partition these items into two nonempty lists, where m is the number of distinct values (all samples with the same value for that feature are moved together as a group).
The following is extremely slow, so I need a more reasonable/faster way of doing this.
sorted_by_feature is a list of (feature_value, 0_or_1) tuples.
same_vals = {}
for ele in sorted_by_feature:
    if ele[0] not in same_vals:
        same_vals[ele[0]] = [ele]
    else:
        same_vals[ele[0]].append(ele)

l = same_vals.keys()
orderings = list(itertools.permutations(l))
for ordering in orderings:
    list_tups = []
    for dic_key in ordering:
        list_tups += same_vals[dic_key]
    left_1 = 0
    left_0 = 0
    right_1 = num_one
    right_0 = num_zero
    for index, tup in enumerate(list_tups):
        # 0's or 1's on the left +/- 1
        # calculate entropy on left/right, calculate entropy drop, etc.
Trivial details (continuing the code above):
        if index == len(sorted_by_feature) - 1:
            break
        if tup[1] == 1:
            left_1 += 1
            right_1 -= 1
        if tup[1] == 0:
            left_0 += 1
            right_0 -= 1
        # only calculate entropy if values to left and right of split are different
        if list_tups[index][0] != list_tups[index + 1][0]:
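For reference, here is a minimal sketch of the two-class entropy helper that this pseudocode presupposes (the function name and signature are mine, not the asker's):

import math

def entropy(count_0, count_1):
    # Shannon entropy (in bits) of a two-class distribution
    total = count_0 + count_1
    if total == 0:
        return 0.0
    h = 0.0
    for count in (count_0, count_1):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h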
tl;dr
You're asking for a miracle. No programming language can help you out of this one. Use better approaches than what you're considering doing!
Your Solution has Exponential Time Complexity
Let's assume a perfect algorithm: one that can give you a new partition in constant O(1) time. In other words, no matter what the input, a new partition can be generated in a guaranteed constant amount of time.
Let's in fact go one step further and assume that your algorithm is only CPU-bound and is operating under ideal conditions. Under ideal circumstances, a high-end CPU can process upwards of 100 billion instructions per second. Since this algorithm takes O(1) time, we'll say, oh, that every new partition is generated in a hundred billionth of a second. So far so good?
Now you want this to perform well. You say you want this to be able to handle an input of size m. You know that that means you need about pow(2,m) iterations of your algorithm - that's the number of partitions you need to generate, and since generating each partition takes a finite amount of time O(1), the total time is just pow(2,m) times O(1). Let's take a quick look at the numbers here:
m = 20 means your time taken is pow(2,20)*10^-11 seconds = 0.00001 seconds. Not bad.
m = 40 means your time taken is pow(2,40) * 10^-11 seconds ≈ 1 trillion / 100 billion = 10 seconds. Also not bad, but note how small m = 40 is. In the vast panopticon of numbers, 40 is nothing. And remember we're assuming ideal conditions.
m = 100 means pow(2,100) * 10^-11 seconds, which is about 10^19 seconds, or hundreds of billions of years! What happened?
You're a victim of algorithmic theory. Simply put, a solution that has exponential time complexity - any solution that takes 2^m time to complete - cannot be sped up by better programming. Generating pow(2,m) outputs is always going to take time proportional to pow(2,m).
Note further that 100 billion instructions/sec is an ideal for high-end desktop computers - your CPU also has to worry about processes other than this program you're running, in which case kernel interrupts and context switches eat into processing time (especially when you're running a few thousand system processes, which you no doubt are). Your CPU also has to read and write from disk, which is I/O bound and takes a lot longer than you think. Interpreted languages like Python also eat into processing time since each line is dynamically converted to bytecode, forcing additional resources to be devoted to that. You can benchmark your code right now and I can pretty much guarantee your numbers will be way higher than the simplistic calculations I provide above. Even worse: storing 2^40 permutations requires 1000 GBs of memory. Do you have that much to spare? :)
Switching to a lower-level language, using generators, etc. is all a pointless affair: they're not the main bottleneck, which is simply the large and unreasonable time complexity of your brute force approach of generating all partitions.
What You Can Do Instead
Use a better algorithm. Generating pow(2,m) partitions and investigating all of them is an unrealistic ambition. You want, instead, to consider a dynamic programming approach. Instead of walking through the entire space of possible partitions, you want to only consider walking through a reduced space of optimal solutions only. That is what dynamic programming does for you. An example of it at work in a problem similar to this one: unique integer partitioning.
Dynamic programming approaches work best on problems that can be formulated as linearized directed acyclic graphs (Google it if you're not sure what I mean!).
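To give a feel for the technique, here is a minimal memoized counter for integer partitions, the flavor of problem in the linked example (an illustration of dynamic programming, not a solution to the entropy split itself):

from functools import lru_cache

@lru_cache(maxsize=None)
def partitions(n, max_part):
    # Count partitions of n into parts no larger than max_part
    if n == 0:
        return 1
    if n < 0 or max_part == 0:
        return 0
    # Either use one part of size max_part, or use no parts of that size at all
    return partitions(n - max_part, max_part) + partitions(n, max_part - 1)

print(partitions(100, 100))  # 190569292, instant thanks to memoization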
If a dynamic approach is out, consider investing in parallel processing with a GPU instead. Your computer already has a GPU - it's what your system uses to render graphics - and GPUs are built to be able to perform large numbers of calculations in parallel. A parallel calculation is one in which different workers can do different parts of the same calculation at the same time - the net result can then be joined back together at the end. If you can figure out a way to break this into a series of parallel calculations - and I think there is good reason to suggest you can - there are good tools for GPU interfacing in Python.
Other Tips
Be very explicit on what you mean by best. If you can provide more information on what best means, we folks on Stack Overflow might be of more assistance and write such an algorithm for you.
Using a bare-metal compiled language might help reduce the amount of real time your solution takes in ordinary situations, but the difference in this case is going to be marginal. Compiled languages are useful when you have to do operations like searching through an array efficiently, since there's no instruction-compilation overhead at each iteration. They're not all that more useful when it comes to generating new partitions, because that's not something that removing the dynamic bytecode generation barrier actually affects.
A couple of minor improvements I can see:
Use try/except instead of if not in to avoid a double lookup of keys:
if ele[0] not in same_vals:
    same_vals[ele[0]] = [ele]
else:
    same_vals[ele[0]].append(ele)

# Should be changed to
try:
    same_vals[ele[0]].append(ele)  # Most of the time this will work
except KeyError:
    same_vals[ele[0]] = [ele]
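Alternatively, collections.defaultdict expresses the same intent with no exception handling at all:

from collections import defaultdict

same_vals = defaultdict(list)
for ele in sorted_by_feature:
    same_vals[ele[0]].append(ele)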
Don't explicitly convert a generator to a list if you don't have to. I don't immediately see any need for your casting to a list, which would slow things down:
orderings = list(itertools.permutations(l))
for ordering in orderings:
# Should be changed to
for ordering in itertools.permutations(l):
But, like I said, these are only minor improvements. If you really needed this to be faster, consider using a different language.
I was reading the docs for the random module and noticed it said pseudo-random. Doesn't "pseudo" mean false? So I was wondering what it means when it says that.
For Example:
import random

print(random.randint(1, 2))
print(random.randint(1, 3))
Does this still mean that the first print statement has a 50% chance of printing 1 and a 50% chance of printing 2, and that the second print statement has a 33% chance of printing 1, a 33% chance of printing 2, etc.?
If not, then how are the pseudo-random numbers generated?
To produce true randomness requires specialized hardware that measures random events, such as radioactive decay (random) or Brownian motion (also essentially random). Most computers obviously don't have these, so instead you have to use a really complex, evenly distributed, hard-to-predict 'pseudorandom' algorithm that starts with a number determined by, for example, the current timestamp. Such algorithms are plenty good enough for standard use cases needing 'randomness', as long as you're careful not to seed two random number generators with the same timestamp (start them at the same time on different threads, for example), which will make them do identical things. A common example of such a random number generator is the Mersenne Twister: http://en.wikipedia.org/wiki/Mersenne_twister
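You can see this determinism directly: two generators seeded with the same value produce identical sequences. A quick sketch:

import random

a = random.Random(12345)  # two generators ...
b = random.Random(12345)  # ... with the same seed

print([a.randint(1, 100) for _ in range(5)])
print([b.randint(1, 100) for _ in range(5)])  # exactly the same list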
A site that offers truly random values, explains a lot about randomness and pseudorandomness and has some yummy statistics about its randomness: http://www.random.org/ (see Learn More and Statistics) (It actually seems that it relies on measuring tiny fluctuations in a chaotic system, e.g. atmospheric noise, but the statistics show that it is so much like true randomness you can't tell it apart!)
I have an interesting problem. I'm faced with a function that takes a long time to compute a value based on some index. Call it takes_a_long_time(index). The values returned from this function are guaranteed to have a global minimum, but there are no guarantees that the index associated with it will be close to zero.
Since takes_a_long_time takes arbitrarily large positive integers as its index, there are unique constraints on how to begin the binary search. I need a way to create a finite interval to search in for the exact minimum. My first thought was to check increasingly large intervals starting from zero. Something like:
def find_interval_with_minimum():
    start = 0
    end = 1
    interval_size = 1
    minimum_in_interval = check_minimum_in(start, end)
    while not minimum_in_interval:
        interval_size = interval_size * 2
        start = end
        end = start + interval_size
        minimum_in_interval = check_minimum_in(start, end)
    return start, end
This would seem to work fine, but there is an additional detail that really throws things off. takes_a_long_time requires exponentially more time to compute a value as indexes approach zero. Since check_minimum_in would require multiple calls to takes_a_long_time, I would like to avoid starting at zero.
So my question is, given that the minimum could be anywhere on [0, +infinity), is there any reasonable way to run this "backwards?" Or, is there some good heuristic to use to avoid checking low indices if not necessary?
I'd love a language agnostic solution. However, I am writing this in Python, so if there is a python specific approach to this, I'd take that as well.
From the comments to the question, the curve is well-behaved, so you could use something like ternary search. The only problem then is how to handle the inconvenient behavior as you approach zero. So don't start at zero: define a new function g from your function f with g(x) = f(1/x). Search this starting from x = 0 and a small value, doubling or otherwise increasing the interval size until it contains the minimum.
To do this, you need to know the limit of f as its argument approaches infinity, or the equivalent limit of g as its argument goes to zero. If it can't be evaluated explicitly, I'd try a numerical approximation.
See the comments to the answer for some points to consider in how you increase the interval size, especially that by Steve Jessop.
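Here is a minimal sketch of that idea, treating the index as continuous for illustration and assuming the bracketing interval [lo, hi] for x = 1/index has already been found as described above:

def minimize_reciprocal(f, lo, hi, iters=100):
    # Ternary-search the minimum of g(x) = f(1/x) on [lo, hi],
    # assuming g is unimodal there
    g = lambda x: f(1.0 / x)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2  # the minimum lies in [lo, m2]
        else:
            lo = m1  # the minimum lies in [m1, hi]
    return 1.0 / ((lo + hi) / 2)  # an index (in f's domain) near the minimum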
Sounds like the thing to do is to pick a large number, big enough that takes_a_long_time doesn't take too long to be acceptable. Start two threads: one which starts looking up towards positive infinity for a range containing the minimum, and another one which starts looking down towards zero for a range containing the minimum. Because of the exponential time increase, 0 might as well be at infinity as far as searching is concerned. Whichever thread finds a result, cancel the other one.
But then, unless you want to take advantage of multiple CPU cores, don't start two threads (and if you do, don't start exactly two threads; start one per core or so). Just alternate doing work on one side or the other.
Given this basic strategy, now you need to tune the rate at which you approach 0. The faster you approach it, the fewer steps it takes to find the minimum if it's really on that side, but the bigger the range left to be binary searched, because on average you'll "overshoot" further towards zero. If the performance curve is reciprocal-exponential, then presumably you want to overshoot as little as possible, so you should approach 0 very slowly. It might even be that your task is computationally infeasible; "exponential" often means "impossible".
Obviously I can't say anything about what the initial "large number" should be. Is a hundred tolerable? Is a million? Graham's number? If you don't even know what's likely to have acceptable performance, you could find out by running in parallel (again, either via threads or via dovetailing) a set of calculations of takes_a_long_time for different indexes until one of them completes. Again, there's no guarantee that this is computationally feasible - if every single index that fits in the memory of your computer takes at least a billion years, you're stuck in practice even though you have a solution in theory.
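A rough sketch of the dovetailing version, reusing the question's hypothetical check_minimum_in and doubling the step on both sides for simplicity (as noted above, you may want the downward side to approach zero more slowly):

def find_bracket(check_minimum_in, big):
    # Alternate one expansion step upward and one downward from `big`
    up_start, up_size = big, 1
    down_end, down_size = big, 1
    while True:
        # One step upward: try [up_start, up_start + up_size)
        if check_minimum_in(up_start, up_start + up_size):
            return up_start, up_start + up_size
        up_start += up_size
        up_size *= 2
        # One step downward: try [lo, down_end), never going below 0
        if down_end > 0:
            lo = max(down_end - down_size, 0)
            if check_minimum_in(lo, down_end):
                return lo, down_end
            down_end = lo
            down_size *= 2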