Binary search over an "infinite" sequence. Where do I start? - python

I have an interesting problem. I'm faced with a function that takes a long time to compute a value based on some index. Call it takes_a_long_time(index). The values returned from this function are guaranteed to have a global minimum, but there is no guarantee that the index associated with it will be close to zero.
Since takes_a_long_time takes arbitrarily large positive integers as its index, there are unique constraints on how to begin the binary search. I need a way to create a finite interval in which to search for the exact minimum. My first thought was to check increasingly large intervals starting from zero. Something like:
def find_interval_with_minimum():
    start = 0
    end = 1
    interval_size = 1
    minimum_in_interval = check_minimum_in(start, end)
    while not minimum_in_interval:
        interval_size = interval_size * 2
        start = end
        end = start + interval_size
        minimum_in_interval = check_minimum_in(start, end)
    return start, end
This would seem to work fine, but there is an additional detail that really throws things off. takes_a_long_time requires exponentially more time to compute a value as indexes approach zero. Since check_minimum_in would require multiple calls to takes_a_long_time, I would like to avoid starting at zero.
So my question is, given that the minimum could be anywhere on [0, +infinity), is there any reasonable way to run this "backwards?" Or, is there some good heuristic to use to avoid checking low indices if not necessary?
I'd love a language agnostic solution. However, I am writing this in Python, so if there is a python specific approach to this, I'd take that as well.

From the comments to the question, the curve is well-behaved and you could use something like ternary search. The only problem then is how to handle the inconvenient behavior as you approach zero. So don't start at zero: define a new function g from your function f with g(x) = f(1/x). Search this starting from the interval between x = 0 and a small value, doubling or otherwise increasing the interval size until it contains the minimum.
To do this, you need to know the limit of f as its argument approaches infinity, or the equivalent limit of g as its argument goes to zero. If it can't be evaluated explicitly, I'd try a numerical approximation.
See the comments to the answer for some points to consider in how you increase the interval size, especially that by Steve Jessop.

Sounds like the thing to do is to pick a large number, big enough that takes_a_long_time doesn't take too long to be acceptable. Start two threads: one which starts looking up towards positive infinity for a range containing the minimum, and another one which starts looking down towards zero for a range containing the minimum. Because of the exponential time increase, 0 might as well be at infinity as far as searching is concerned. Whichever thread finds a result, cancel the other one.
But then, unless you want to take advantage of multiple CPU cores, don't start two threads (and if you do, don't start exactly two threads, start one per core or so). Just alternate doing work on one side or the other.
Given this basic strategy, now you need to tune the rate at which you approach 0. The faster you approach it, the fewer steps to find the minimum if it's really on that side, but the bigger the range left to be binary searched, because on average you'll "overshoot" further towards zero. If the performance curve is reciprocal-exponential, then presumably you want to overshoot as little as possible, so should approach 0 very slowly. It might even be that your task is computationally infeasible, "exponential" often means "impossible".
Obviously I can't say anything about what the initial "large number" should be. Is a hundred tolerable? Is a million? Graham's number? If you don't even know what's likely to have acceptable performance, you could find out by running in parallel (again, either via threads or via dovetailing) a set of calculations of takes_a_long_time for different indexes until one of them completes. Again, there's no guarantee that this is computationally feasible - if every single index that fits in the memory of your computer takes at least a billion years, you're stuck in practice even though you have a solution in theory.
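The bracketing idea from the answers above can be sketched roughly as follows. This is only an illustration under stated assumptions: f is strictly unimodal on the non-negative integers (strictly decreasing up to the minimum, strictly increasing after it), and find_minimum, start and down_step are hypothetical names, not anything from the question. It doubles the step when moving away from zero, creeps toward zero in small fixed steps, then runs a ternary search on the bracket. Calls are cached because each evaluation is expensive.

```python
from functools import lru_cache


def find_minimum(f, start, down_step=1):
    """Index of the minimum of f over integers >= 0.

    Assumes f is strictly unimodal: strictly decreasing up to the
    minimum and strictly increasing after it.
    """
    f = lru_cache(maxsize=None)(f)  # evaluations are expensive; never repeat one

    if f(start + 1) < f(start):
        # Still descending at `start`: the minimum lies to the right.
        # Double the step until the values start rising again.
        lo, mid, step = start, start + 1, 1
        while True:
            nxt = mid + step
            if f(nxt) >= f(mid):
                hi = nxt
                break
            lo, mid, step = mid, nxt, step * 2
    else:
        # The minimum is at or to the left of `start`: creep toward zero in
        # small fixed steps so the overshoot into the slow region stays small.
        hi, mid = start + 1, start
        while True:
            nxt = max(0, mid - down_step)
            if nxt == mid or f(nxt) >= f(mid):
                lo = nxt
                break
            hi, mid = mid, nxt

    # Ternary search on the bracket [lo, hi].
    while hi - lo > 2:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return min(range(lo, hi + 1), key=f)
```

Choosing start is still the open question the answer raises: it has to be an index large enough that a single evaluation is tolerable.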

Related

Why do we only consider size of an input when estimating algorithm's complexity?

For the sake of the argument, consider following (very bad) sorting algorithm in python:
def so(ar):
    while True:
        le = len(ar)
        switch = False
        for y in range(le):
            if y+1 == le:
                break
            if ar[y] > ar[y+1]:
                ar[y],ar[y+1] = ar[y+1],ar[y]
                switch = True
        if switch == False:
            break
    return ar
I'm trying to understand the concept of "complexity of the algorithm" and there is one thing I don't get.
I came across the post that explains how to find the complexity of the algorithm here:
You add up how many machine instructions it will execute as a function
of the size of its input, and then simplify the expression to the
largest (when N is very large) term and can include any simplifying
constant factor.
But well, the problem is, that I cannot calculate how many machine instructions will be executed just
by knowing the length of the list.
Consider first example:
import random
import time

li = [random.randint(1,5000) for x in range(3000)]
start = time.time()
so(li)
end = time.time() - start
print(end)
Output: 2.96921706199646
Now have a look at the second example:
ok = [5000,43000,232] + [x for x in range(2997)]
start = time.time()
so(ok)
end = time.time() - start
print(end)
Output: 0.010689020156860352
We can see the same sorting algorithm applied to two different lists of the same length, with two completely different execution times.
When people are talking about algorithm complexity (big O notation) they normally assume that the only variable that determines complexity of the algo is the size of the input, but clearly, in the example above it is not the case. It is not only the size of the list, but also the positioning of each value within the list that determines the speed of the sorting.
So my question is, why do we only consider size of input when estimating complexity?
And, if it is possible, can you tell me what the complexity of the algorithm above will be?
You're correct, complexity doesn't only depend on N. That's why you'll often see indications about average, worst and best cases.
Timsort is used in Python because it's O(n log n) on average, still fast in the worst case (O(n log n)) and extremely fast in the best case (O(n), when the list is already sorted).
Quicksort also has an average complexity of O(n log n), but its worst case is O(n²), which a naive pivot choice (such as always taking the first or last element) hits when the list is already sorted. This use case happens very often, so it might be worth it to actually shuffle the list before sorting it!
why do we only consider size of input when estimating complexity?
In the narrow sense of complexity as of the use of Big O notation in computer science, it is simply by definition:
In computer science, big O notation is used to classify algorithms according to how their running time or space requirements grow as the input size grows.
In the broader sense your question could be interpreted as "why do we use Big O notation to describe algorithm complexity when the nature of the data can be just as important as its size."
The answer here lies in the fact that algorithm development is often done on small datasets to make it easy, while in the real world the datasets are huge. When you are writing your sorting function you're most likely going to try it first on small lists of random data. You'd want the result small enough that you can verify that it worked by simply looking at the result...
Time complexity does not always depend only on the size of the input. For sorting algorithms in particular, the pattern of the input can play a significant role in determining the running time.
We usually state time complexity for the worst, best and average case, and we can study specific input orders or patterns that lead to each of those cases.
For example, in the first case you provided, the input is random, so each particular ordering occurs with probability 1/n!. The best case (when the list is already sorted) is Ω(n) and the worst case (when the list is sorted in reverse) is O(n²), but the probability of hitting exactly the best or worst case is low.
Therefore, the sorting algorithm has Θ(n²) average time complexity: with randomly distributed numbers, any two elements are likely to need a comparison and a swap.
In the second case, the order is strict, which means the input is much more likely to sit close to the best or the worst case. In your case, the input is close to the best case, hence the shorter time.
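To make the best-case and worst-case distinction concrete, here is a small sketch that counts comparisons instead of measuring wall-clock time. count_comparisons is a hypothetical helper, not from the question; it performs the same passes as so() but works on a copy and tallies each element comparison.

```python
def count_comparisons(values):
    """Run the question's bubble sort on a copy and count comparisons."""
    ar = list(values)
    comparisons = 0
    while True:
        swapped = False
        for y in range(len(ar) - 1):
            comparisons += 1
            if ar[y] > ar[y + 1]:
                ar[y], ar[y + 1] = ar[y + 1], ar[y]
                swapped = True
        if not swapped:  # a full pass without swaps means the list is sorted
            break
    return comparisons


n = 10
best = count_comparisons(range(n))          # already sorted: one pass, n - 1 comparisons
worst = count_comparisons(range(n, 0, -1))  # reversed: n passes, n * (n - 1) comparisons
```

Both inputs have the same length, yet the counts grow linearly in one case and quadratically in the other, which is why complexity is quoted per case rather than as a single function of n.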

Method for finding best result from (almost) random data

So I'm working on some calculator for a game I play - for fun, which takes various abilities with different cooldowns, usage times, a percentage in which they may be used at etc ...
So far I am doing this by treating each sequence as a number whose base is the number of abilities I have, so for example assuming I have 5 abilities used over 4 seconds:
0000: 60 damage (using ability 0, trying to use it again but failing - so returns ability damage of 0)
0001: 60 damage
Skip a few ...
0101: 200 damage
and again ...
4444: 70 damage.
Process terminates. - Hope that made sense.
The problem is that brute force works well for short times (like above) and a small number of abilities, but with much longer times and more abilities it ends up analyzing trillions of simulations, so brute force is no longer an option.
The question is: considering the data is mostly random, are there any heuristic algorithms that (although they may not return the optimum) will return a relatively good result?
Thanks for any responses :)
Let me rephrase to make sure I understand correctly: you want to find the best sequencing of skills, given their individual damage and cooldowns, such that only one skill is used at each time, and no skill is used more often than its cooldown allows. If so, it is a kind of a scheduling problem and one way to approach would be through linear programming.
The rough idea is to introduce n_skills * simulation_length variables x[skill][time], each constrained between 0 and 1, with the interpretation of "use skill skill at time time if x[skill][time] == 1, don't use if == 0". Now you optimize the sum of all variables weighted by the damage their skill does, sum(x[skill][:] * damage[skill] for skill in skills), under additional linear constraints (explained through numpy-like pseudocode):
for each time t, sum(x[:][t]) <= 1 (at each time you can use at most one ability)
for each ability a and time t0, sum(x[a][t0-cooldown(a):t0+cooldown(a)]) <= 1 (within the period of its cooldown, you can only use your ability at most once)
Now the tricky part is that while this will give you a solution that is optimal in some sense, it will most likely not be physical, that is you'll get fractional xs. This is where the heuristic part kicks in, you have to find some way to "round" the solution to integers, losing objective value in the process, to make it physically (game-ally) meaningful. One way is to only keep x[a][t] == 1, and round all other numbers down to zero. It will give a meaningful solution, but it may not be very satisfying (ie. your character will do almost nothing). Given that my model for the problem is quite simple, I would expect there are some theoretical results on how to give a good rounding.
While I can suggest the scipy package for solving the linear program once it's formulated, the whole problem of building the constraint matrix and rounding the results (even trivially) is not a beginner-level programming task.
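As a rough illustration of what building the constraint matrix can look like, here is a sketch in plain Python. Everything in it is an assumption made for the example: build_lp, damage, cooldown and horizon are hypothetical names, time is discretized into horizon steps, and a cooldown of c is modeled as "at most one use in any c consecutive steps", which is a one-sided window rather than the symmetric one written above.

```python
def build_lp(damage, cooldown, horizon):
    """Return (c, A_ub, b_ub) for: minimise c.x subject to A_ub.x <= b_ub.

    Variable x[skill][t] is flattened to index skill * horizon + t.
    """
    n_skills = len(damage)
    n_vars = n_skills * horizon

    def var(skill, t):
        return skill * horizon + t

    # Maximising total damage is the same as minimising its negative.
    c = [-damage[s] for s in range(n_skills) for _ in range(horizon)]

    rows = []
    # At most one skill per time step.
    for t in range(horizon):
        row = [0] * n_vars
        for s in range(n_skills):
            row[var(s, t)] = 1
        rows.append(row)
    # At most one use of a skill in any window of cooldown[s] consecutive steps.
    for s in range(n_skills):
        for t0 in range(max(1, horizon - cooldown[s] + 1)):
            row = [0] * n_vars
            for t in range(t0, min(t0 + cooldown[s], horizon)):
                row[var(s, t)] = 1
            rows.append(row)

    b_ub = [1] * len(rows)
    return c, rows, b_ub


c, A_ub, b_ub = build_lp(damage=[60, 200], cooldown=[1, 3], horizon=4)
```

If scipy is installed, these three values could be handed to scipy.optimize.linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1)) to solve the relaxation; rounding the fractional result remains the heuristic step described above.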

all (2^m−2)/2 possible ways to partition list

Each sample is an array of features (ints). I need to split my samples into two separate groups by figuring out what the best feature, and the best splitting value for that feature, is. By "best", I mean the split that gives me the greatest entropy difference between the pre-split set and the weighted average of the entropy values on the left and right sides. I need to try all (2^m−2)/2 possible ways to partition these items into two nonempty lists (where m is the number of distinct values (all samples with the same value for that feature are moved together as a group))
The following is extremely slow so I need a more reasonable/ faster way of doing this.
sorted_by_feature is a list of (feature_value, 0_or_1) tuples.
same_vals = {}
for ele in sorted_by_feature:
    if ele[0] not in same_vals:
        same_vals[ele[0]] = [ele]
    else:
        same_vals[ele[0]].append(ele)
l = same_vals.keys()
orderings = list(itertools.permutations(l))
for ordering in orderings:
    list_tups = []
    for dic_key in ordering:
        list_tups += same_vals[dic_key]
    left_1 = 0
    left_0 = 0
    right_1 = num_one
    right_0 = num_zero
    for index, tup in enumerate(list_tups):
        #0's or #1's on the left +/- 1
        # calculate entropy on left/ right, calculate entropy drop, etc.
Trivial details (continuing the code above):
        if index == len(sorted_by_feature) -1:
            break
        if tup[1] == 1:
            left_1 += 1
            right_1 -= 1
        if tup[1] == 0:
            left_0 += 1
            right_0 -= 1
        #only calculate entropy if values to left and right of split are different
        if list_tups[index][0] != list_tups[index+1][0]:
tl;dr
You're asking for a miracle. No programming language can help you out of this one. Use better approaches than what you're considering doing!
Your Solution has Exponential Time Complexity
Let's assume a perfect algorithm: one that can give you a new partition in constant O(1) time. In other words, no matter what the input, a new partition can be generated in a guaranteed constant amount of time.
Let's in fact go one step further and assume that your algorithm is only CPU-bound and is operating under ideal conditions. Under ideal circumstances, a high-end CPU can process upwards of 100 billion instructions per second. Since this algorithm takes O(1) time, we'll say, oh, that every new partition is generated in a hundred billionth of a second. So far so good?
Now you want this to perform well. You say you want this to be able to handle an input of size m. You know that that means you need about pow(2,m) iterations of your algorithm - that's the number of partitions you need to generate, and since generating each partition takes a finite amount of time O(1), the total time is just pow(2,m) times O(1). Let's take a quick look at the numbers here:
m = 20 means your time taken is pow(2,20)*10^-11 seconds = 0.00001 seconds. Not bad.
m = 40 means your time taken is pow(2,40)*10^-11 seconds, roughly 1 trillion/100 billion = 10 seconds. Also not bad, but note how small m = 40 is. In the vast panopticon of numbers, 40 is nothing. And remember we're assuming ideal conditions.
m = 100 means pow(2,100)*10^-11 seconds, about 10^19 seconds, or roughly 400 billion years! What happened?
You're a victim of algorithmic theory. Simply put, a solution that has exponential time complexity - any solution that takes 2^m time to complete - cannot be sped up by better programming. Generating or producing pow(2,m) outputs is always going to take you the same proportion of time.
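The arithmetic is easy to reproduce. The sketch below only restates the idealised assumption used in this answer (one partition every 10^-11 seconds); brute_force_seconds and the constant names are made up for the example.

```python
SECONDS_PER_PARTITION = 1e-11   # idealised: 100 billion partitions per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_seconds(m):
    """Time to generate about 2**m partitions at the idealised rate."""
    return 2 ** m * SECONDS_PER_PARTITION


for m in (20, 40, 100):
    seconds = brute_force_seconds(m)
    print(f"m = {m:3d}: {seconds:.3g} s  ({seconds / SECONDS_PER_YEAR:.3g} years)")
```

Each extra distinct value doubles the running time, so m = 100 comes out near 1.3e19 seconds no matter how tight the inner loop is.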
Note further that 100 billion instructions/sec is an ideal for high-end desktop computers - your CPU also has to worry about processes other than this program you're running, in which case kernel interrupts and context switches eat into processing time (especially when you're running a few thousand system processes, which you no doubt are). Your CPU also has to read and write from disk, which is I/O bound and takes a lot longer than you think. Interpreted languages like Python also eat into processing time, since the bytecode is interpreted at run time rather than compiled to native machine code. You can benchmark your code right now and I can pretty much guarantee your numbers will be way higher than the simplistic calculations I provide above. Even worse: storing 2^40 permutations requires about 1000 GB of memory, even at a single byte each. Do you have that much to spare? :)
Switching to a lower-level language, using generators, etc. is all a pointless affair: they're not the main bottleneck, which is simply the large and unreasonable time complexity of your brute force approach of generating all partitions.
What You Can Do Instead
Use a better algorithm. Generating pow(2,m) partitions and investigating all of them is an unrealistic ambition. You want, instead, to consider a dynamic programming approach. Instead of walking through the entire space of possible partitions, you want to only consider walking through a reduced space of optimal solutions only. That is what dynamic programming does for you. An example of it at work in a problem similar to this one: unique integer partitioning.
Dynamic programming approaches work best on problems that can be formulated as linearized directed acyclic graphs (Google it if not sure what I mean!).
If a dynamic approach is out, consider investing in parallel processing with a GPU instead. Your computer already has a GPU - it's what your system uses to render graphics - and GPUs are built to be able to perform large numbers of calculations in parallel. A parallel calculation is one in which different workers can do different parts of the same calculation at the same time - the net result can then be joined back together at the end. If you can figure out a way to break this into a series of parallel calculations - and I think there is good reason to suggest you can - there are good tools for GPU interfacing in Python.
Other Tips
Be very explicit on what you mean by best. If you can provide more information on what best means, we folks on Stack Overflow might be of more assistance and write such an algorithm for you.
Using a bare-metal compiled language might help reduce the amount of real time your solution takes in ordinary situations, but the difference in this case is going to be marginal. Compiled languages are useful when you have to do operations like searching through an array efficiently, since there's no instruction-compilation overhead at each iteration. They're not all that more useful when it comes to generating new partitions, because that's not something that removing the dynamic bytecode generation barrier actually affects.
A couple of minor improvements I can see:
Use try/except instead of if not in to avoid double lookup of keys
if ele[0] not in same_vals:
    same_vals[ele[0]] = [ele]
else:
    same_vals[ele[0]].append(ele)
# Should be changed to
try:
    same_vals[ele[0]].append(ele) # Most of the time this will work
except KeyError:
    same_vals[ele[0]] = [ele]
Don't explicitly convert a generator to a list if you don't have to. I don't immediately see any need for your conversion to a list, which would slow things down
orderings = list(itertools.permutations(l))
for ordering in orderings:
    ...
# Should be changed to
for ordering in itertools.permutations(l):
    ...
But, like I said, these are only minor improvements. If you really needed this to be faster, consider using a different language.
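As a side note on the enumeration itself: the (2^m−2)/2 two-way partitions mentioned in the question can be generated directly from the distinct values, without going through permutations (which produces m! orderings). A minimal sketch, assuming the values are distinct; bipartitions is a hypothetical helper name.

```python
from itertools import combinations


def bipartitions(values):
    """Yield each split of `values` into two non-empty groups exactly once."""
    values = list(values)
    if len(values) < 2:
        return
    first, rest = values[0], values[1:]
    # Pinning `first` to the left group removes mirror-image duplicates.
    # k is how many of the remaining values join it; never all of them,
    # so the right group is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = [first, *extra]
            right = [v for v in rest if v not in extra]
            yield left, right


splits = list(bipartitions(["a", "b", "c", "d"]))
```

For m = 4 this yields 7 splits, matching (2^4−2)/2. The count is still exponential in m, so this only removes the redundant work, not the underlying blow-up.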

Compute random number over certain time interval with Python

I did some research before posting but seem to be at a loss (not too experienced in coding).
I am attempting to generate or compute a random number for certain time interval with Python. I'm not looking for full code, I want help using the time library if that is the correct one to use.
Pseudo-code:
Allow python [PC] to compute a random number for 3 seconds
------> Store the computed generation in a value (i can handle this)
I would then use the random generated value to link access a python list (which would be automatically generated via a random number generation as well but i can figure that out).
I'm not sure why you want to do this, but here's how to compute many random numbers, throwing most of them away, and then using the last one after 3 seconds have elapsed.
import random
import time
# time.clock() exists only up to Python 3.7; on newer versions use time.perf_counter()
start = time.clock()
while time.clock() - start < 3:
    random_number = random.randint(0,100)
print(random_number)
This pointlessly throws away about 2 million perfectly good random numbers on my machine.
(And, as abarnert points out, this also maxes out one CPU core for the whole 3 seconds in a busy loop, which is very, very wasteful, but I think it's what you were asking for?)
EDIT: Updated to use time.clock instead of time.time, as suggested by abarnert again (thanks), because this seems to give better resolution across platforms and doesn't suffer from problems when the system time is altered in the middle of the program running.
First, you didn't say what kind of random number you want to generate, but given that your example is 10, I assume it's an integer in some range—let's say you're calling random.randrange(30).
Now, you want to compute a number every second for 3 seconds, then keep the last one. I don't know why you'd even want to do this, but you can do it like this:
for i in range(3):
    number = random.randrange(30)
    time.sleep(1.0)
At the end of 3 seconds, number will be the third random number generated.
The key here is that, to do something once per second (in a synchronous program—don't do this in a GUI or server!)—you just call time.sleep.
If the operation you were doing took a significant chunk of a second (or longer), this wouldn't be appropriate. Instead, you'd want to compute the start time, and sleep until a second after that:
t0 = time.monotonic()
for i in range(3):
    number = random.randrange(30)
    t0 += 1
    # never pass a negative value to sleep if the work overran the deadline
    time.sleep(max(0, t0 - time.monotonic()))
Note that I've used time.monotonic here. This function is specifically designed for this kind of use case. It returns as much precision as can be gotten with reasonable efficiency (in particular, unlike time.time, it doesn't give you 1s precision on some platforms), and it guarantees that you'll never go backward even if, e.g., you change the system clock in the middle of the program. If you're using 3.2 or earlier, either look through the docs for the best alternative (possibly using time.clock()), or look into using ctypes to call the appropriate platform native function.
But in this case, random.randrange is going to take somewhere on the order of a microsecond, which is so much less time than the minimum resolution of most systems' simple timers that there's no reason to do such a thing.
If you want to take 3 seconds to get a random number, because you're concerned about the quality of the random number, you can use os.urandom() to generate the value. If all you really want to do is to select an item from your list at random, you can use random.choice()
Note: The function time.clock() has been removed in Python 3.8, after having been deprecated since Python 3.3: use time.perf_counter() or time.process_time() instead, depending on your requirements, to have well-defined behavior. (Contributed by Matthias Bussonnier in bpo-36895.)
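For current Python versions, the busy-loop idea from the first answer could look roughly like this, using time.perf_counter() as the note suggests. It is only a sketch: last_random_within and its parameters are hypothetical names, and the loop is just as wasteful of CPU as the original.

```python
import random
import time


def last_random_within(seconds, low=0, high=100):
    """Keep drawing random integers for `seconds`; return the last one drawn."""
    deadline = time.perf_counter() + seconds
    value = random.randint(low, high)
    while time.perf_counter() < deadline:
        value = random.randint(low, high)
    return value


number = last_random_within(0.1)  # use 3 for the three seconds in the question
```

If the goal is simply to pick a list element at random, random.choice(some_list) does that directly, with no waiting involved.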

the best way to calculate the position of a number in prime list?

for example
f(2)->1
f(3)->2
f(4)->-1 //4 is not a prime
f(5)->3
...
Generally, I make a prime generator and count until it reaches x:
def f(x):
    p = primeGenerator()
    count=1
    while True:
        y = next(p)
        if y>x:
            return -1
        elif y==x:
            return count
        else:
            count+=1
Isn't this too slow? I can cache the list for the next call, but if I guarantee that the input MUST be a prime, so that I don't have to test whether the input number is prime, is there a faster formula to get the answer?
The best method depends on what inputs you get, and whether the function will be called many times or just once or a few times.
If it will be called often, and all inputs you are going to receive are small, not larger than 10^7 say, the best method is to create a lookup table in advance, and just look up the input.
If it will not be called often, and all inputs are small, just generating the primes not exceeding the input and counting them is certainly good enough. It might be an enhancement to remember what you already have for the next call, so that when the first argument is 19394489, and the next is 20889937, you don't need to start from 0 again, but only need to find the primes between them. But whether the extra storage is worthwhile depends on the arguments passed.
If it will be called often and the arguments are not too large, not exceeding 10^13 say, the best method is to precompute the values of π(n) for some select values of n, and for each argument look up the value for the next smaller precomputed point, and then generate and count the primes between that point and the target value (or if the target is closer to the next larger precomputed point, count the primes between the target and that).
If you calculate e.g. π(n) for all multiples of 10^7 not exceeding 10^13, you get a lookup table with one million entries, that's not very taxing on the memory nowadays, and never need to sieve a range larger than five million, which doesn't take long.
You could also have the lookup table as a file or database on disk, which would allow much shorter intervals between the precomputed points. That would also eliminate the time for reading in the precomputed table on startup, but the lookup would now involve an access to the file system, which takes much longer than a memory read. What would be the best strategy depends on the expected inputs and the system it's run on.
Computing the lookup table will however take rather long if the upper limit isn't small, but that's a one-time cost.
If the expected inputs are larger, up to 10^16 say, and you're not willing to spend the time necessary for precomputing a lookup table for that range, your best bet is to implement one of the better algorithms for the prime counting function. Meissel's method as refined by Lehmer is relatively easy to implement (not so easy that I'll give an example implementation here, though, but here's a Haskell implementation that might help). Better, but more complicated, is the method as improved by Miller et al.
Beyond that, you'd need to research the current state-of-the-art, and probably should use a lower-level language than Python.
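For the small-input case, the lookup-table approach might be sketched like this: sieve once, then answer each query with a binary search. primes_up_to and prime_position are hypothetical names, the limit is an arbitrary assumption (at least 2), and anything above the limit is reported as -1 just like a non-prime, which a real implementation would want to distinguish.

```python
from bisect import bisect_left
from math import isqrt


def primes_up_to(limit):
    """Sieve of Eratosthenes: sorted list of all primes <= limit (limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for n in range(2, isqrt(limit) + 1):
        if sieve[n]:
            # Cross out n*n, n*n + n, ... in one slice assignment.
            sieve[n * n :: n] = bytes(len(range(n * n, limit + 1, n)))
    return [n for n, is_prime in enumerate(sieve) if is_prime]


def prime_position(x, primes):
    """1-based position of x in `primes`, or -1 if x is not in the table."""
    i = bisect_left(primes, x)
    if i < len(primes) and primes[i] == x:
        return i + 1
    return -1


table = primes_up_to(100)
positions = [prime_position(x, table) for x in (2, 3, 4, 5)]
```

The sieve is the one-time cost; each lookup afterwards is O(log n) in the size of the table.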
You have to check all preceding candidates for primality. There are no shortcuts. As you say, you can cache the result of a prior calculation and start from there, but that's really the best you can do.
