Is it possible to make a binary search by searching between unknown values? - python

I have a function (call it f(x)) that returns a number.
The value of x is between 0 and 1: f(0) gives the biggest number and f(1) the smallest.
But I don't know whether, for example, f(0.2) gives a different number from f(0), so I have to search for all the numbers using binary search.
I know that I can iterate from x = 0 to x = 1, but I want to make as few function calls as possible.
Do you have any suggestions?
I can start by calling f(0), f(1), f(0.5), and then f(0.25) or f(0.75), and so on.
(Mathematically I can divide x endlessly; here I can choose a precision limit.)

First, you must be sure that the function is monotone. If you cannot be sure of that, you cannot use a binary search.
Second, define the accuracy you want.
Then perform the binary search until no improvement is made or the desired accuracy is reached.
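A rough sketch of that procedure follows. The helper name distinct_values, the eps precision parameter and the example step function are illustrative assumptions, not part of the question. It bisects recursively and skips any interval whose two endpoints already give the same value, which is only valid if f is monotone:

```python
def distinct_values(f, lo=0.0, hi=1.0, eps=1e-3):
    """Collect the distinct values of a monotone f on [lo, hi].

    Returns (values sorted from largest to smallest, number of calls to f).
    """
    cache = {}  # x -> f(x), so every point is evaluated at most once

    def call(x):
        if x not in cache:
            cache[x] = f(x)
        return cache[x]

    found = set()

    def search(a, b):
        fa, fb = call(a), call(b)
        found.update((fa, fb))
        # Monotone f with equal endpoints is constant in between;
        # also stop once the interval is narrower than the precision.
        if fa == fb or b - a <= eps:
            return
        mid = (a + b) / 2
        search(a, mid)
        search(mid, b)

    search(lo, hi)
    return sorted(found, reverse=True), len(cache)


# Hypothetical step function: non-increasing on [0, 1]
values, calls = distinct_values(lambda x: 10 - int(x * 4))
print(values, calls)
```

Steps narrower than eps can still be missed, so eps plays the role of the precision limit mentioned in the question.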

Related

How to generate a biased coin with 1/3 probability of success using only "fair coins"

This is a problem on my homework:
Write a function (that takes no parameters) and generates a biased coin using only "fair coins".
I've found a solution which requires using a binary digit stream, since 1/3 in binary is 0.010101...
However, I'm wondering if there's a way to solve this without using a binary digit stream. Here's the code for using a binary digit stream:
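import random
def fairCoin():
    # fair coin: 0 or 1 with equal probability
    return random.choice([0,1])
def oneThird():
    # binary digits of 1/3 after the point: 0, 1, 0, 1, ...
    while True:
        yield 0
        yield 1
def biasedCoin(binaryDigitStream, fairCoin):
    # return the digit at the first position where the flip differs from it
    for d in binaryDigitStream:
        if fairCoin() != d:
            return d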
Hopefully I've understood the question correctly.
I'm going to use T for success, F for failure. I've put the next bit in a code box to format correctly. It shows what the outcome is and what the cumulative probabilities are if you reflip only for certain results.
                                    T %       F %
First fair coin flip                50%       50%
Reflip only if the result was T     25%       75%
Reflip AGAIN only if F              37.5%     62.5%
Reflip AGAIN only if T              31.25%    68.75%
etc, etc
Do you see where I'm going here? If you do, go code it now.
You need a Reflip function that takes as arguments the result of the last flip and a fair coin. It flips the coin and hands back the result if it's the same as the last result; otherwise it calls itself again with the new result. The first call to it should use F as the initial result. Theoretically, the function could go on infinitely, but that's what you need to generate 1/3, since 1/3 has no finite binary expansion.
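A minimal sketch of that Reflip idea, written as a loop rather than a recursive call so that a long run of alternating flips cannot hit the recursion limit. The names fair_coin, reflip and biased_coin are assumptions for illustration:

```python
import random


def fair_coin():
    """Fair coin: 'T' (success) or 'F' (failure) with equal probability."""
    return random.choice("TF")


def reflip(last, coin):
    """Flip until a flip repeats the previous result, then return it."""
    result = coin()
    while result != last:
        last = result
        result = coin()
    return result


def biased_coin():
    """Takes no parameters; should return 'T' with probability 1/3."""
    return reflip("F", fair_coin)


trials = 30000
share = sum(biased_coin() == "T" for _ in range(trials)) / trials
print(share)  # expected to be close to 1/3
```

Starting from F, the coin returns T with probability 1/4 + 1/16 + 1/64 + ... = 1/3, which matches the cumulative table above.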

Binary search to find the highest possible value in Python

I need to find a very large number (20 digits), so it lies in the range between 0 and 99999999999999999999.
I can perform a check if the number is larger or smaller than the guessed number, so for example:
is_smaller(12341234123412341234)
# True
is_smaller(98769876987698769876)
# False
How the function is_smaller works is unknown, but its result for a given number is constant.
Could this be solved with a binary search? I'm not quite sure how to implement it, as I only ever know whether the number is smaller/larger.
Most implementations of binary search I've come across use it to find a given number in an array, which doesn't work for me as the number is unknown.
How could I use binary search in this scenario, or would another method be better suited?
The goal is to find the highest possible value, that still returns True for is_smaller.
edit: I do not have a way of telling if the number is bigger, so I have no is_bigger function. So in a smaller range (e.g. 0 to 10), if the number of interest is 6, the function I have would return:
[...]
is_smaller(4)
# True
is_smaller(5)
# True
is_smaller(6)
# True
is_smaller(7)
# False
is_smaller(8)
# False
I have to admit the function's name in the question was very poorly chosen.
If something is neither bigger nor smaller than the number you're looking for, it's the number you're looking for.
def is_answer(n):
    return not is_smaller(n) and not is_larger(n)
Now you can use standard binary search; just replace conditionals that look like
if x == search_term:
if x < search_term:
if x > search_term:
With
if is_answer(x):
if is_smaller(x):
if is_larger(x):
Respectively. If you want a <= or >= operator for your binary search, you can construct it yourself from these building blocks.
Binary search splits a range lower_boundary .. higher_boundary into either lower_boundary .. (lower_boundary + higher_boundary) // 2 or (lower_boundary + higher_boundary) // 2 + 1 .. higher_boundary, depending on the outcome of your is_smaller function.
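Since the question only provides is_smaller, here is a hedged sketch of a search that needs nothing else. The function name find_highest and the stand-in predicate are assumptions. It relies on is_smaller being True up to some value and False above it, as in the 0 to 10 example:

```python
def find_highest(is_smaller, lo=0, hi=10**20 - 1):
    """Return the largest n in [lo, hi] with is_smaller(n) True, or None."""
    if not is_smaller(lo):
        return None  # not even the lowest candidate passes
    # Invariant: is_smaller(lo) is True and the answer is <= hi.
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the range always shrinks
        if is_smaller(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


# Stand-in for the unknown function, using the small example (number 6)
print(find_highest(lambda n: n <= 6, 0, 10))  # 6
```

For the full 20-digit range this takes fewer than 70 calls to is_smaller, because each call halves the remaining range.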

Pseudorandom Algorithm for VERY Large (10^1.2mil) Numbers?

I'm looking for a pseudo-random number generator (an algorithm where you input a seed number and it outputs a different 'random-looking' number, and the same seed will always generate the same output) for numbers between 1 and 95^1,312,000.
I would use the Linear Feedback Shift Register (LFSR) PRNG, but if I did, I would have to convert the seed number (which could be up to 1.2 million digits long in base-10) into a binary number, which would be so massive that I think it would take too long to compute.
In response to a similar question, the Feistel cipher was recommended, but I didn't understand the vocabulary of the wiki page for that method (I'm going into 10th grade so I don't have a degree in encryption), so if you could use layman's terms, I would strongly appreciate it.
Is there an efficient way of doing this which won't take until the end of time, or is this problem impossible?
Edit: I forgot to mention that the prng sequence needs to have a full period. My mistake.
A simple way to do this is to use a linear congruential generator with modulus m = 95^1312000.
The formula for the generator is x_(n+1) = a*x_n + c (mod m). By the Hull-Dobell Theorem, it will have full period if and only if gcd(m,c) = 1 and 95 divides a-1. Furthermore, if you want good second values (right after the seed) even for very small seeds, a and c should be fairly large. Also, your code can't store these values as literals (they would be much too big). Instead, you need to be able to reliably produce them on the fly. After a bit of trial and error to make sure gcd(m,c) = 1, I hit upon:
import random
def get_book(n):
    random.seed(1941) #Borges' Library of Babel was published in 1941
    m = 95**1312000
    a = 1 + 95 * random.randint(1, m//100)
    c = random.randint(1, m - 1) #math.gcd(c,m) = 1
    return (a*n + c) % m
For example:
>>> book = get_book(42)
>>> book % 10**100
4779746919502753142323572698478137996323206967194197332998517828771427155582287891935067701239737874
shows the last 100 digits of "book" number 42. Given Python's built-in support for large integers, the code runs surprisingly fast (it takes less than 1 second to grab a book on my machine).
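The full-period condition can be checked by brute force on a much smaller modulus with the same prime factors. This is only an illustrative sketch: the helper lcg_period, the modulus 95**2 and the constants a and c below are assumptions chosen for the demonstration, not values from the answer.

```python
from math import gcd


def lcg_period(a, c, m, seed=0):
    """Steps until x -> (a*x + c) % m returns to seed, or None if it
    does not return within m steps."""
    x = seed
    for steps in range(1, m + 1):
        x = (a * x + c) % m
        if x == seed:
            return steps
    return None


m = 95 ** 2                    # same prime factors (5 and 19) as 95**1312000
print(gcd(7, m))               # c = 7 is coprime to m
print(lcg_period(96, 7, m))    # a - 1 = 95 is divisible by 95: full period m
print(lcg_period(6, 7, m))     # a - 1 = 5 is not divisible by 19: shorter
```

A full period means that every seed in the range maps to a distinct output, which is what the question's edit asks for.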
If you have a method that can produce a pseudo-random digit, then you can concatenate as many together as you want. It will be just as repeatable as the underlying PRNG.
However, you'll probably run out of memory scaling that up to millions of digits and attempting to do arithmetic. Normally stuff on that scale isn't done on "numbers". It's done on byte vectors, or something similar.

What's the difference between randomly picking a 5-digit number, and picking each digit individually?

Is there any difference whatsoever between using random.randrange to pick 5 digits individually, like this:
a=random.randrange(0,10)
b=random.randrange(0,10)
c=random.randrange(0,10)
d=random.randrange(0,10)
e=random.randrange(0,10)
print (a,b,c,d,e)
...and picking the 5-digit number at once, like this:
x=random.randrange(0, 100000)
print (x)
Any random-number-generator differences (if any --- see the section on Randomness) are minuscule compared to the utility and maintainability drawbacks of the digit-at-a-time method.
For starters, generating each digit would require a lot more code to handle perfectly normal calls like randrange(0, 1024) or randrange(0, 2**32), where the digits do not arise in equal probability. For example, on the closed-closed range [0,1023] (requiring 4 digits), the first digit of the four can never be anything other than 0 or 1. The last digit is slightly more likely to be a 0, 1, 2, or 3. And so on.
Trying to cover all the bases would rapidly make that code slower, more bug-prone, and more brittle than it already is. (The number of annoying little details you've encountered just posting this question should give you an idea what lies farther down that path.)
...and all that grief is before you consider how easily random.randrange handles non-zero start values, the step parameter, and negative arguments.
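To make the unequal digit probabilities concrete, this small sketch simply enumerates every value that randrange(0, 1024) can return and counts its first and last decimal digits (the variable names are illustrative):

```python
from collections import Counter

values = range(0, 1024)  # every possible result of randrange(0, 1024)

# Leading digit when each value is written with four digits
first = Counter(f"{v:04d}"[0] for v in values)
# Final decimal digit
last = Counter(v % 10 for v in values)

print(first)  # only '0' and '1' occur
print(last)   # 0, 1, 2 and 3 occur once more often than 4 through 9
```

A digit-at-a-time generator would have to reproduce these uneven frequencies to match randrange on such a range.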
Randomness Problems
If your RNG is good, your alternative method should produce "equally random" results (assuming you've handled all the problems I mentioned above). However, if your RNG is biased, then the digit-at-a-time method will probably increase its effect on your outputs.
For demonstration purposes, assume your absurdly biased RNG has an off-by-one error, so that it never produces the last value of the given range:
The call randrange(0, 2**32) will never produce 2**32 - 1 (4,294,967,295), but the remaining 4-billion-plus values will appear in very nearly their expected probability. Its output over millions of calls would be very hard to distinguish from a working pseudo-random number generator.
Producing the ten digits of that same supposedly-random number individually will subject each digit to that same off-by-one error, resulting in a ten-digit output that consists entirely of the digits [0,8], with no 9s present... ever. This is vastly "less random" than generating the whole number at once.
Conversely, the digit-at-a-time method will never be better than the RNG backing it, even when the range requested is very small. That method might magnify any RNG bias, or just repeat that bias, but it will never reduce it.
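The off-by-one scenario can be simulated with a deliberately faulty wrapper. broken_randrange is a hypothetical function written only for this demonstration; the standard random.randrange does not behave this way.

```python
import random


def broken_randrange(start, stop):
    """Hypothetical faulty RNG: never returns the last value, stop - 1."""
    return random.randrange(start, stop - 1)


random.seed(0)

# Digit at a time: every digit suffers the error, so 9 never appears
digits = [broken_randrange(0, 10) for _ in range(10000)]
print(9 in digits)  # False

# Whole number at once: only the single value 10**10 - 1 is missing
whole = [broken_randrange(0, 10**10) for _ in range(10000)]
print(any("9" in str(w) for w in whole))  # True
```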
Yes, no and no.
Yes: probabilities multiply, so the digit sequences have the same probability
prob(a) and prob(b) = prob(a) * prob(b)
Since each digit has a 0.1 chance of appearing, the probability of two particular digits in order is 0.1**2, or 0.01, which is the probability of any particular number between 0 and 99 inclusive.
No: you have a typo in your second number.
The second form only has four digits; you probably meant randrange(0, 100000)
No: the output will not be the same
The second form will not print leading zeros; you could print("%05d"%x) to get all the digits. Also, the first form has spaces in the output, so you could instead print("%d%d%d%d%d"%(a,b,c,d,e)).
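As a quick check of the "probabilities multiply" argument, this sketch enumerates all equally likely 5-digit sequences and confirms that they map one-to-one onto 0 through 99999. The helper digits_to_number is an illustrative assumption.

```python
import random
from itertools import product


def digits_to_number(digits):
    """Combine decimal digits, most significant first, into one integer."""
    n = 0
    for d in digits:
        n = n * 10 + d
    return n


# Each of the 10**5 digit sequences corresponds to exactly one number
numbers = sorted(digits_to_number(ds) for ds in product(range(10), repeat=5))
print(numbers == list(range(100000)))  # True

# Printing both forms the same way, leading zeros included
a, b, c, d, e = (random.randrange(0, 10) for _ in range(5))
x = random.randrange(0, 100000)
print("%d%d%d%d%d" % (a, b, c, d, e))
print("%05d" % x)
```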

Solving recursive sequence

Lately I've been solving some challenges from Google Foobar for fun, and now I've been stuck on one of them for more than 4 days. It is about a recursive function defined as follows:
R(0) = 1
R(1) = 1
R(2) = 2
R(2n) = R(n) + R(n + 1) + n (for n > 1)
R(2n + 1) = R(n - 1) + R(n) + 1 (for n >= 1)
The challenge is writing a function answer(str_S) where str_S is a base-10 string representation of an integer S, which returns the largest n such that R(n) = S. If there is no such n, return "None". Also, S will be a positive integer no greater than 10^25.
I have investigated a lot about recursive functions and about solving recurrence relations, but with no luck. I outputted the first 500 numbers and I found no relation with each one whatsoever. I used the following code, which uses recursion, so it gets really slow when numbers start getting big.
def getNumberOfZombits(time):
    if time == 0 or time == 1:
        return 1
    elif time == 2:
        return 2
    else:
        if time % 2 == 0:
            newTime = time // 2
            return getNumberOfZombits(newTime) + getNumberOfZombits(newTime+1) + newTime
        else:
            newTime = time // 2 # integer division, so rounds down
            return getNumberOfZombits(newTime-1) + getNumberOfZombits(newTime) + 1
The challenge also included some test cases so, here they are:
Test cases
==========
Inputs:
(string) str_S = "7"
Output:
(string) "4"
Inputs:
(string) str_S = "100"
Output:
(string) "None"
I don't know if I need to solve the recurrence relation to anything simpler, but as there is one for even and one for odd numbers, I find it really hard to do (I haven't learned about it in school yet, so everything I know about this subject is from internet articles).
So, any help at all guiding me to finish this challenge will be welcome :)
Instead of trying to simplify this function mathematically, I simplified the algorithm in Python. As suggested by @LambdaFairy, I implemented memoization in the getNumberOfZombits(time) function. This optimization sped up the function a lot.
Then I moved on to the next step: finding which input produces a given number of rabbits. I had analyzed the function before by looking at its plot, and I knew that the even numbers reach higher outputs first and that only after some time do the odd numbers reach the same level. As we want the highest input for that output, I first needed to search the even numbers and then the odd numbers.
In the plot, the odd numbers always take longer than the even ones to reach the same output.
The problem is that we could not search by increasing the number by 1 each time (it was too slow). To solve that, I implemented a binary-search-like algorithm. First, I searched the even numbers (with the binary-search-like algorithm) until I found an answer or had no more numbers to search. Then I did the same for the odd numbers, and if an answer was found, I replaced whatever I had before with it (as it was necessarily bigger than the previous answer).
I have the source code I used to solve this, so if anyone needs it I don't mind sharing it :)
The key to solving this puzzle was using a binary search.
As you can see from the sequence generators, they rely on a roughly n/2 recursion, so calculating R(N) takes about 2*log2(N) recursive calls; and of course you need to do it for both the odd and the even sequences.
That's not too bad, but you need to figure out where to search for the N that produces the given value. To do this, I first implemented a search for upper and lower bounds for N. I walked up N by powers of 2, until I had N and 2N that formed the lower and upper bounds respectively for each sequence (odd and even).
With these bounds, I could then do a binary search between them to quickly find the value of N, or its non-existence.
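Neither answer includes its source, so the following is only a sketch of the approach they describe: memoization plus one binary search over the even indices and one over the odd indices. It assumes, as both answers do, that R increases along the even indices and along the odd indices. The helper name search_parity is an assumption.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def R(n):
    """Memoized version of the recurrence from the question."""
    if n < 2:
        return 1
    if n == 2:
        return 2
    half = n // 2
    if n % 2 == 0:
        return R(half) + R(half + 1) + half
    return R(half - 1) + R(half) + 1


def search_parity(target, parity):
    """Find n = 2*k + parity with R(n) == target, or None."""
    hi = 1
    while R(2 * hi + parity) < target:  # double until an upper bound is found
        hi *= 2
    lo = 0
    while lo <= hi:                     # ordinary binary search on k
        mid = (lo + hi) // 2
        value = R(2 * mid + parity)
        if value == target:
            return 2 * mid + parity
        if value < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


def answer(str_S):
    target = int(str_S)
    hits = [n for n in (search_parity(target, 0), search_parity(target, 1))
            if n is not None]
    return str(max(hits)) if hits else "None"


print(answer("7"))    # 4
print(answer("100"))  # None
```

Taking the maximum of the two hits avoids having to argue which parity gives the larger index.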
