I'm stuck on a Codewars kata and hope someone can help me out (without spoiling the solution).
The problem is that I don't fully understand how it should work. I get the idea of the exercise, but some things are confusing, especially in the sample tests.
Here are the instructions:
The number 81 has a special property: a certain power of the sum of its digits is equal to 81 (nine squared). Eighty-one (81) is the first number having this property (not considering one-digit numbers). The next one is 512. Let's see both cases in detail.
8 + 1 = 9 and 9^2 = 81
5 + 1 + 2 = 8 and 8^3 = 512
We need to make a function, power_sumDigTerm(), that receives a number n and outputs the nth term of this sequence of numbers. The cases presented above mean that:
power_sumDigTerm(1) == 81
power_sumDigTerm(2) == 512
And here are the sample tests:
test.describe("Example Tests")
test.it("n = " + str(1))
test.assert_equals(power_sumDigTerm(1), 81)
test.it("n = " + str(2))
test.assert_equals(power_sumDigTerm(2), 512)
test.it("n = " + str(3))
test.assert_equals(power_sumDigTerm(3), 2401)
test.it("n = " + str(4))
test.assert_equals(power_sumDigTerm(4), 4913)
test.it("n = " + str(5))
test.assert_equals(power_sumDigTerm(5), 5832)
My main problem is that I don't understand how they got the results for the sample tests.
A good speed-up trick is to not check all numbers. Any such number must be of the form a^b for integers a and b. If you find a way to enumerate those and check them, you will have a fairly efficient solution.
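To make the a^b idea concrete, here is a minimal sketch. The function names and the bounds max_base and max_exp are my own assumptions, not part of the kata; with bounded bases and exponents the sorted list is only reliable for its first several terms, so the bounds would need raising for larger n.

```python
def digit_sum(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))


def power_sum_dig_terms(max_base=100, max_exp=20):
    """Collect every base**exp (exp >= 2) whose digit sum equals the base.

    max_base and max_exp are assumed search bounds, chosen only so that
    the first few terms come out complete.
    """
    found = set()
    for base in range(2, max_base + 1):
        value = base
        for _ in range(2, max_exp + 1):
            value *= base  # value is now base**exp
            if digit_sum(value) == base:
                found.add(value)
    return sorted(found)


def power_sum_dig_term(n):
    """Return the nth term (1-based) found within the assumed bounds."""
    return power_sum_dig_terms()[n - 1]
```

Only perfect powers are ever examined, which is why this is much cheaper than testing every integer in turn.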
For f(5), the sum of the digits is 5+8+3+2 = 18, and 18^3 = 5832.
Brute force method would look like this for the next one:
Start at 5833, add the digits, and check the powers of the sum to see if you get the number. This would actually be very fast, as you can see that the last one only got to ^3. As soon as the power is larger than the number you are seeking, move on to the next number: 5834. When you find one, insert it into a table to remember it.
The number theory experts may be able to find a more efficient method but this brute force method is likely to be pretty fast.
Grab a prime generator; you need only prime exponents to generate the sequence of candidate powers (although you will have all integers >= 2 as bases in the test for inclusion). This is because any number that is a composite power, a^(b*c), is also a prime power of a different base, (a^b)^c.
Maintain a list of current powers, indexed by the base integer. For instance, once you've made it to a limit of 100, you'll have the list
[0, 0, 64, 81, 64, 25, 36, 49, 64, 81, 100]
// Highest power no greater than the current limit
... and the current list of target numbers has only one element: [81]
Extending the limit:
Pick the lowest number in the list (25 = 5^2, in this case)
Multiply by its base: 25 => 125
Check: is 1+2+5 a root of 125? (There are minor ways to speed this up)
If so, add 125 to the list
Now, go back to all lower integers (the range [2, 5-1] ), and add any smaller prime powers of those integers. (I haven't worked out whether there can ever be more than one power to add for a given integer; that's an interesting prime-based problem.)
Whenever you add a new target number, make sure you insert it in numerical order; the step in the previous paragraph may introduce a "hit" lower than the one that triggered the iteration. For instance, this could append 5832 before it finds 4913 (I haven't coded and executed this algorithm). You could collect all the added target numbers, sort that list, and append them as a block.
Although complicated, I believe that this will be notably faster than the brute-force methods given elsewhere.
I have a number which is 615 digits in length. Throughout the number, there are 8 fixed places where a digit is missing. I have to find what those missing digits are, so there are 10^8 possibilities. After computing them I have to raise a ciphertext to each possible number, reduce the output mod N, and see which number gives the correct output. In other words, I am trying to find the decryption key in an RSA problem. My main concern right now is how to efficiently and properly create all 10^8 possible answers.
I am using gmpy2, and to get that to work I had to download Python 2.7 just to avoid an error when installing gmpy2. I hope they are adequate to tackle this problem. If not, I would really appreciate someone pointing me in the correct direction.
I have not tried anything yet, as I'm sure this will take hours to compute. So I really want to make sure I am doing everything correctly, so that if I let my laptop run for a couple of hours it neither gets damaged nor freezes, leaving me sitting here not knowing whether it has crashed or is still computing.
So I suppose I am seeking advice on how I should proceed.
In terms of actual code, I suppose looping through 0-9 eight times is not that hard, but I don't know how to insert a number into another number. In Python, how do I make it so that a digit will only be inserted into the position I need? The number looks like this example:
X = 124621431523_13532535_62635292 //this is only 30 digits long, mine is 615 digits long
where each "_" is where a number is missing.
I am completely at a loss on how to do this.
Once all the numbers are generated, I aim to loop through them all and raise them until I get the answer required. This part seems to be a bit easier, as it seems like just a simple loop.
So I guess my main question is how to loop through 10^8 numbers but placing them in a specific spot inside a number that is already 615 digits long? I am seeking advice on technical as well as code design so as to not take too long to generate them all.
Thank you for reading.
Turn the number into a string, use the format method, use itertools.product to generate numbers to fill the holes, then turn it back.
Example:
from itertools import product

def make_seed(n, replace_positions):
    seed = str(n)
    to_replace = [-1] + replace_positions
    substrings = [seed[start + 1:end]
                  for start, end
                  in zip(to_replace, to_replace[1:])] + [seed[to_replace[-1] + 1:]]
    return '{}'.join(substrings)

def make_number(seed):
    n = seed.count('{}')
    for numbers in product([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], repeat=n):
        yield int(seed.format(*numbers))

seed = make_seed(123456789, [3, 5, 7])
# seed = '123{}5{}7{}9'
for i in make_number(seed):
    print(i)
Output:
123050709
123050719
123050729
123050739
123050749
123050759
123050769
123050779
123050789
123050799
123051709
123051719
123051729
...
Since a decimal number is just the sum of digit * pow(10, n) over its positions, you can assume the unknown digits to be zero and then add digit * pow(10, position) for each trial digit.
from itertools import product

# 124621431523_13532535_62635292 is the original number
x = 124621431523013532535062635292  # unknown digits set to zero
positions = [8, 17]  # powers of ten of the missing digits (0-based, counted from the right)
trials = product(range(0, 10), repeat=2)
for t in trials:
    x_prime = x
    for (digit, pos) in zip(t, positions):
        x_prime = x_prime + digit * pow(10, pos)
    print(x_prime)  # do your checking here
outputs:
124621431523013532535062635292
124621431523113532535062635292
124621431523213532535062635292
124621431523313532535062635292
...
etc
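For the checking step itself, here is a rough sketch of how the generation and the modular exponentiation could be combined. The names fill_holes and find_exponents are made up for illustration, and the "_" template format follows the example in the question. It uses Python's built-in three-argument pow; gmpy2.powmod could be swapped in if it turns out to be faster for 615-digit values. Because the candidates come from a generator, nothing like 10^8 numbers is ever held in memory at once.

```python
from itertools import product


def fill_holes(template, hole="_"):
    """Yield every integer obtained by replacing each hole with a digit 0-9."""
    parts = template.split(hole)
    for digits in product("0123456789", repeat=len(parts) - 1):
        pieces = [parts[0]]
        for digit, part in zip(digits, parts[1:]):
            pieces.append(digit)
            pieces.append(part)
        yield int("".join(pieces))


def find_exponents(template, ciphertext, modulus, expected):
    """Yield each candidate d with ciphertext**d % modulus == expected."""
    for d in fill_holes(template):
        # pow with three arguments does modular exponentiation without
        # ever building the huge intermediate power.
        if pow(ciphertext, d, modulus) == expected:
            yield d
```

With a toy key (N = 3233, e = 17, d = 2753) and the template "27_3", the generator should report 2753 among its matches. Printing a progress message every million candidates or so would also answer the "is it still computing?" worry.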
I'm trying to gather some statistics on prime numbers, among which is the distribution of factors for the number (prime-1)/2. I know there are general formulas for the size of factors of uniformly selected numbers, but I haven't seen anything about the distribution of factors of one less than a prime.
I've written a program to iterate through primes starting at the first prime after 2^63, and then factor the (prime - 1)/2 using trial division by all primes up to 2^32. However, this is extremely slow because that is a lot of primes (and a lot of memory) to iterate through. I store the primes as a single byte each (by storing the increment from one prime to the next). I also use a deterministic variant of the Miller-Rabin primality test for numbers up to 2^64, so I can easily detect when the remaining value (after a successful division) is prime.
I've experimented with a variant of Pollard rho and with elliptic curve factorization, but it is hard to find the right balance between trial division and switching to these more complicated methods. Also, I'm not sure I've implemented them correctly, because sometimes they seem to take a very long time to find a factor, and based on their asymptotic behavior I'd expect them to be quite quick for such small numbers.
I have not found any information on factoring many numbers (vs just trying to factor one), but it seems like there should be some way to speed up the task by taking advantage of this.
Any suggestions, pointers to alternate approaches, or other guidance on this problem is greatly appreciated.
Edit:
The way I store the primes is by storing an 8-bit offset to the next prime, with the implicit first prime being 3. Thus, in my algorithms, I have a separate check for division by 2, then I start a loop:
factorCounts = collections.Counter()
while N % 2 == 0:
    factorCounts[2] += 1
    N //= 2
pp = 3
for gg in smallPrimeGaps:
    if pp*pp > N:
        break
    if N % pp == 0:
        while N % pp == 0:
            factorCounts[pp] += 1
            N //= pp
    pp += gg
Also, I used a wheel sieve to calculate the primes for trial division, and I use an algorithm based on the remainder by several primes to get the next prime after the given starting point.
I use the following for testing if a given number is prime (porting code to c++ now):
bool IsPrime(uint64_t n)
{
    if(n < 341531)
        return MillerRabinMulti(n, {9345883071009581737ull});
    else if(n < 1050535501)
        return MillerRabinMulti(n, {336781006125ull, 9639812373923155ull});
    else if(n < 350269456337)
        return MillerRabinMulti(n, {4230279247111683200ull, 14694767155120705706ull, 1664113952636775035ull});
    else if(n < 55245642489451)
        return MillerRabinMulti(n, {2ull, 141889084524735ull, 1199124725622454117ull, 11096072698276303650ull});
    else if(n < 7999252175582851)
        return MillerRabinMulti(n, {2ull, 4130806001517ull, 149795463772692060ull, 186635894390467037ull, 3967304179347715805ull});
    else if(n < 585226005592931977)
        return MillerRabinMulti(n, {2ull, 123635709730000ull, 9233062284813009ull, 43835965440333360ull, 761179012939631437ull, 1263739024124850375ull});
    else
        return MillerRabinMulti(n, {2ull, 325ull, 9375ull, 28178ull, 450775ull, 9780504ull, 1795265022ull});
}
I don't have a definitive answer, but I do have some observations and some suggestions.
There are about 2*10^17 primes between 2^63 and 2^64, so any program you write is going to run for a while.
Let's talk about a primality test for numbers in the range 2^63 to 2^64. Any general-purpose test will do more work than you need, so you can speed things up by writing a special-purpose test. I suggest strong-pseudoprime tests (as in Miller-Rabin) to bases 2 and 3. If either of those tests shows the number is composite, you're done. Otherwise, look up the number (binary search) in a table of strong-pseudoprimes to bases 2 and 3 (ask Google to find those tables for you). Two strong pseudoprime tests followed by a table lookup will certainly be faster than the deterministic Miller-Rabin test you are currently performing, which probably uses six or seven bases.
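As an illustration of the two-base test, here is a small Python sketch (the question is porting to C++, so treat this as pseudocode that happens to run). The function names are mine, and the pseudoprime table is left as an empty placeholder: the real table of composites that pass both bases has to be obtained separately. I use a set lookup here instead of the binary search suggested above, purely for brevity.

```python
def is_strong_probable_prime(n, base):
    """Strong probable-prime (Miller-Rabin) test of an odd n > 2 to one base."""
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(base, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False


def is_prime_2_3(n, pseudoprime_table=frozenset()):
    """Test bases 2 and 3, then consult a table of known exceptions.

    pseudoprime_table is an assumed, externally supplied collection of
    composites that pass both tests; with the empty default the result
    is only 'probably prime'.
    """
    if n < 5:
        return n in (2, 3)
    if n % 2 == 0 or n % 3 == 0:
        return False
    if not is_strong_probable_prime(n, 2):
        return False
    if not is_strong_probable_prime(n, 3):
        return False
    return n not in pseudoprime_table
```

For example, 2047 passes base 2 but is rejected by base 3, while 1373653 (the smallest composite that passes both) is only caught by the table lookup.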
For factoring, trial division to 1000 followed by Brent-Rho until the product of the known prime factors exceeds the cube root of the number being factored ought to be fairly fast, a few milliseconds. Then, if the remaining cofactor is composite, it will necessarily have only two factors, so SQUFOF would be a good algorithm to split them. It is faster than the other methods because all the arithmetic is done with numbers less than the square root of the number being factored, which in your case means the factorization could be done using 32-bit arithmetic instead of 64-bit arithmetic.
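Here is a simplified sketch of the "trial division, then rho" pipeline. To keep it short I have swapped in plain Pollard rho with Floyd cycle detection rather than Brent's variant, and I leave out the SQUFOF stage entirely, so this shows the overall shape rather than the tuned method described above. The function names and the trial_limit parameter are my own.

```python
import math
import random


def is_prime(n):
    """Deterministic Miller-Rabin for n < 2**64 (first 12 primes as bases)."""
    if n < 2:
        return False
    small = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in small:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in small:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


def pollard_rho(n):
    """Return a nontrivial factor of an odd composite n (Floyd's cycle)."""
    while True:
        c = random.randrange(1, n)
        x = y = random.randrange(2, n)
        d = 1
        while d == 1:
            x = (x * x + c) % n        # tortoise: one step
            y = (y * y + c) % n        # hare: two steps
            y = (y * y + c) % n
            d = math.gcd(abs(x - y), n)
        if d != n:                     # d == n means this attempt failed
            return d


def factor(n, trial_limit=1000):
    """Return the sorted prime factors of n, with multiplicity."""
    factors = []
    for p in range(2, trial_limit):
        while n % p == 0:
            factors.append(p)
            n //= p
    stack = [n] if n > 1 else []
    while stack:
        m = stack.pop()
        if is_prime(m):
            factors.append(m)
        else:
            d = pollard_rho(m)
            stack.extend((d, m // d))
    return sorted(factors)
```

If rho seems to take "a very long time" on 64-bit inputs, a common culprit is an unlucky constant c or starting value; retrying with fresh random values, as the outer loop does, usually resolves it.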
Instead of factoring and primality tests, a better method uses a variant of the Sieve of Eratosthenes to factor large blocks of numbers. That will still be slow, as there are 203 million sieving primes less than 2^32, and you will need to deal with the bookkeeping of a segmented sieve, but considering that you factor lots of numbers at once, it's probably the best approach to your task.
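To show what "sieving to factor a block" means, here is a toy version for small ranges. It assumes lo >= 2 and recomputes the sieving primes on every call, which a real segmented sieve over 2^63..2^64 would of course do once and keep around; the names small_primes and factor_block are just for this sketch.

```python
import math


def small_primes(limit):
    """Primes <= limit by a plain Sieve of Eratosthenes."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(flags) if flag]


def factor_block(lo, hi):
    """Factor every integer in [lo, hi) at once (assumes lo >= 2).

    Returns a list of factor lists, one per integer, in order.
    """
    remaining = list(range(lo, hi))
    factors = [[] for _ in remaining]
    for p in small_primes(math.isqrt(hi - 1)):
        start = (-lo) % p  # index of the first multiple of p in the block
        for idx in range(start, hi - lo, p):
            while remaining[idx] % p == 0:
                remaining[idx] //= p
                factors[idx].append(p)
    for idx, r in enumerate(remaining):
        if r > 1:
            factors[idx].append(r)  # what is left over must be prime
    return factors
```

Each sieving prime only touches the entries it actually divides, which is where the saving over factoring the numbers one by one comes from.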
I have code for everything mentioned above at my blog.
This is how I store primes for later:
(I'm going to assume you want the factors of the number, and not just a primality test).
Copied from my website http://chemicaldevelopment.us/programming/2016/10/03/PGS.html
I’m going to assume you know the binary number system for this part. If not, just think of 1 as a “yes” and 0 as a “no”.
So, there are plenty of algorithms to generate the first few primes. I use the Sieve of Eratosthenes to compute a list.
But, if we stored the primes as an array, like [2, 3, 5, 7] this would take up too much space. How much space exactly?
Well, 32-bit integers, which can store values up to 2^32, each take up 4 bytes, because each byte is 8 bits and 32 / 8 = 4.
If we wanted to store each prime under 2,000,000,000, we would have to store over 98,000,000 integers. This takes up more space, and is slower at runtime, than a bitset, which is explained below.
This approach will take 98,000,000 integers of space (each is 32 bits, which is 4 bytes, so roughly 392 MB), and when we check at runtime, we will need to check every integer in the array until we find it, or we find a number that is greater than it.
For example, say I give you a small list of primes: [2, 3, 5, 7, 11, 13, 17, 19]. I ask you if 15 is prime. How do you tell me?
Well, you would go through the list and compare each to 15.
Is 2 = 15?
Is 3 = 15?
. . .
Is 17 = 15?
At this point, you can stop because you have passed where 15 would be, so you know it isn’t prime.
Now then, let’s say we use a list of bits to tell you if the number is prime. The list above would look like:
001101010001010001010
This starts at 0, and goes to 20.
The 1s mean that the index is prime, so count from the left: 0, 1, 2
001101010001010001010
The third bit from the left (index 2) is 1, which indicates that 2 is prime.
In this case, if I asked you to check if 15 is prime, you don't need to go through all the numbers in the list; all you need to do is skip to index 15 and check that single bit.
And for memory usage, the first approach uses 98,000,000 integers, whereas this one can store 32 numbers in a single integer (using the list of 1s and 0s), so we would need
2,000,000,000 / 32 = 62,500,000 integers.
So it uses about 64% as much memory as the first approach, and is much faster to use.
We store the array of integers from the second approach in a file, then read it back at runtime.
This uses 250 MB of RAM to store primality data for every number below 2,000,000,000.
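A small Python sketch of the bitset idea follows. It packs the flags into a bytearray (8 numbers per byte) rather than 32-bit integers, which is just a convenience in Python; the function names are illustrative. It also builds an unpacked flag array first, so it is meant for modest limits, not for 2,000,000,000 as written.

```python
import math


def build_prime_bitset(limit):
    """Return a bytearray where bit n is 1 iff n is prime, for 0 <= n <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    bits = bytearray((limit >> 3) + 1)
    for n, flag in enumerate(flags):
        if flag:
            bits[n >> 3] |= 1 << (n & 7)  # byte n // 8, bit n % 8
    return bits


def is_prime_bit(bits, n):
    """Look up a single bit: no scanning of a prime list is needed."""
    return bool((bits[n >> 3] >> (n & 7)) & 1)
```

Reading the bits for 0 through 20 back out reproduces the 0/1 string shown earlier, and checking 15 is a single index-and-mask operation.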
You can further reduce this with wheel sieving (like what you did storing (prime-1)/2)
I'll go a little bit more into wheel sieve.
You got it right by storing (prime - 1)/2, and 2 being a special case.
You can extend this to p# (the product of the first p primes)
For example, you use (1#)*k+1 for numbers k
You can also use the set of linear equations (n#)*k+L, where L is the set of primes less than n# and 1 excluding the first n primes.
So, you can just store info for 6*k+1 and 6*k+5, because for 2# = 6 we get L = {1, 2, 3, 5} \ {2, 3} = {1, 5}. Larger wheels save even more.
These examples should give you an understanding of some of the methods behind it.
You will need some way to implement this bitset, such as a list of 32-bit integers, or a string.
Look at: https://pypi.python.org/pypi/bitarray for a possible abstraction
Is there any difference whatsoever between using random.randrange to pick 5 digits individually, like this:
a=random.randrange(0,10)
b=random.randrange(0,10)
c=random.randrange(0,10)
d=random.randrange(0,10)
e=random.randrange(0,10)
print (a,b,c,d,e)
...and picking the 5-digit number at once, like this:
x=random.randrange(0, 100000)
print (x)
Any random-number-generator differences (if any --- see the section on Randomness) are minuscule compared to the utility and maintainability drawbacks of the digit-at-a-time method.
For starters, generating each digit would require a lot more code to handle perfectly normal calls like randrange(0, 1024) or randrange(0, 2**32), where the digits do not occur with equal probability. For example, on the closed-closed range [0,1023] (requiring 4 digits), the first digit of the four can never be anything other than 0 or 1. The last digit is slightly more likely to be a 0, 1, 2, or 3. And so on.
Trying to cover all the bases would rapidly make that code slower, more bug-prone, and more brittle than it already is. (The number of annoying little details you've encountered just posting this question should give you an idea what lies farther down that path.)
...and all that grief is before you consider how easily random.randrange handles non-zero start values, the step parameter, and negative arguments.
Randomness Problems
If your RNG is good, your alternative method should produce "equally random" results (assuming you've handled all the problems I mentioned above). However, if your RNG is biased, then the digit-at-a-time method will probably increase its effect on your outputs.
For demonstration purposes, assume your absurdly biased RNG has an off-by-one error, so that it never produces the last value of the given range:
The call randrange(0, 2**32) will never produce 2**32 - 1 (4,294,967,295), but the remaining 4-billion-plus values will appear in very nearly their expected probability. Its output over millions of calls would be very hard to distinguish from a working pseudo-random number generator.
Producing the ten digits of that same supposedly-random number individually will subject each digit to that same off-by-one error, resulting in a ten-digit output that consists entirely of the digits [0,8], with no 9s present... ever. This is vastly "less random" than generating the whole number at once.
Conversely, the digit-at-a-time method will never be better than the RNG backing it, even when the range requested is very small. That method might magnify any RNG bias, or just repeat that bias, but it will never reduce it.
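The off-by-one thought experiment can be simulated directly. In this sketch biased_randrange is a deliberately broken, hypothetical wrapper (it is not how random.randrange behaves), and the sample count and seed are arbitrary choices.

```python
import random


def biased_randrange(rng, start, stop):
    """Hypothetical broken generator: it can never return stop - 1."""
    return rng.randrange(start, stop - 1)


def count_nines(numbers):
    """Total number of '9' digits across all the given numbers."""
    return sum(str(n).count("9") for n in numbers)


def demo(samples=10000, seed=0):
    rng = random.Random(seed)
    # Whole-number method: only the single value 99999 is unreachable.
    whole = [biased_randrange(rng, 0, 100000) for _ in range(samples)]
    # Digit-at-a-time method: every digit is subject to the same bug.
    by_digit = [
        int("".join(str(biased_randrange(rng, 0, 10)) for _ in range(5)))
        for _ in range(samples)
    ]
    return count_nines(whole), count_nines(by_digit)
```

The first count comes out in the thousands, while the second is exactly zero: the digit-at-a-time numbers never contain a 9.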
Yes, no and no.
Yes: probabilities multiply, so the digit sequences have the same probability
prob(a) and prob(b) = prob(a) * prob(b)
Since each digit has a 0.1 chance of appearing, the probability of two particular digits in order is 0.1**2, or 0.01, which is the probability of any particular number between 0 and 99 inclusive.
No: you have a typo in your second number.
The second form only has four digits; you probably meant randrange(0, 100000)
No: the output will not be the same
The second form will not print leading zeros; you could print("%05d"%x) to get all the digits. Also, the first form has spaces in the output, so you could instead print("%d%d%d%d%d"%(a,b,c,d,e)).
Lately I've been solving some challenges from Google Foobar for fun, and now I've been stuck on one of them for more than 4 days. It is about a recursive function defined as follows:
R(0) = 1
R(1) = 1
R(2) = 2
R(2n) = R(n) + R(n + 1) + n (for n > 1)
R(2n + 1) = R(n - 1) + R(n) + 1 (for n >= 1)
The challenge is writing a function answer(str_S) where str_S is a base-10 string representation of an integer S, which returns the largest n such that R(n) = S. If there is no such n, return "None". Also, S will be a positive integer no greater than 10^25.
I have read a lot about recursive functions and about solving recurrence relations, but with no luck. I printed the first 500 values and found no pattern relating them whatsoever. I used the following code, which uses recursion, so it gets really slow when the numbers start getting big.
def getNumberOfZombits(time):
    if time == 0 or time == 1:
        return 1
    elif time == 2:
        return 2
    else:
        if time % 2 == 0:
            newTime = time // 2
            return getNumberOfZombits(newTime) + getNumberOfZombits(newTime+1) + newTime
        else:
            newTime = time // 2  # integer division, so rounds down
            return getNumberOfZombits(newTime-1) + getNumberOfZombits(newTime) + 1
The challenge also included some test cases so, here they are:
Test cases
==========
Inputs:
(string) str_S = "7"
Output:
(string) "4"
Inputs:
(string) str_S = "100"
Output:
(string) "None"
I don't know if I need to solve the recurrence relation to anything simpler, but as there is one for even and one for odd numbers, I find it really hard to do (I haven't learned about it in school yet, so everything I know about this subject is from internet articles).
So, any help at all guiding me to finish this challenge will be welcome :)
Instead of trying to simplify this function mathematically, I simplified the algorithm in Python. As suggested by @LambdaFairy, I implemented memoization in the getNumberOfZombits(time) function. This optimization sped up the function a lot.
Then I moved on to the next step: finding which input produces a given number of rabbits. I had analyzed the function before by looking at its plot, and I knew that the even numbers reach a given output first and the odd numbers only reach the same level later. As we want the highest input for that output, I first needed to search the even numbers and then the odd numbers.
The odd numbers always take longer than the even numbers to reach the same output.
The problem is that we cannot search by increasing the input by 1 each time (that was too slow). To solve that, I implemented a binary-search-like algorithm. First, I would search the even numbers (with the binary-search-like algorithm) until I found an answer or had no more numbers to search. Then I did the same for the odd numbers, and if an answer was found, I replaced whatever I had before with it (as it was necessarily bigger than the previous answer).
I have the source code I used to solve this, so if anyone needs it I don't mind sharing it :)
The key to solving this puzzle was using a binary search.
As you can see from the sequence generators, they rely on a roughly n/2 recursion, so calculating R(N) takes about 2*log2(N) recursive calls; and of course you need to do it for both the odd and the even.
That's not too bad, but you need to figure out where to search for the N which will give you the input. To do this, I first implemented a search for upper and lower bounds for N. I walked up N by powers of 2 until I had N and 2N that formed the lower and upper bounds respectively for each sequence (odd and even).
With these bounds, I could then do a binary search between them to quickly find the value of N, or its non-existence.
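Putting the memoization and the two binary searches together might look like the sketch below. It is not either poster's actual code, and it relies on the observation made above that R is increasing along the even indices and along the odd indices separately. The helper name search is my own; answer(str_S) follows the signature in the challenge.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def R(n):
    """The recurrence from the question, memoized."""
    if n < 2:
        return 1
    if n == 2:
        return 2
    half = n // 2
    if n % 2 == 0:
        return R(half) + R(half + 1) + half
    return R(half - 1) + R(half) + 1


def search(target, parity):
    """Binary search over n = 2*k + parity for R(n) == target.

    Assumes R is increasing in k for a fixed parity.
    """
    lo, hi = 0, 1
    while R(2 * hi + parity) < target:  # double until we bracket the target
        hi *= 2
    while lo <= hi:
        mid = (lo + hi) // 2
        value = R(2 * mid + parity)
        if value == target:
            return 2 * mid + parity
        if value < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


def answer(str_S):
    target = int(str_S)
    hits = [n for n in (search(target, 0), search(target, 1)) if n is not None]
    return str(max(hits)) if hits else "None"
```

Taking the maximum of the two hits avoids depending on which parity is searched first. Each call to R only recurses to a depth of about log2(n), so inputs up to 10^25 stay well within Python's recursion limit.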
I am trying to write code that does the following:
Multiplying the digits of an integer and continuing the process gives
the surprising result that the sequence of products always arrives at
a single-digit number.
For example:
715 -> 35 -> 15 -> 5
88 -> 64 -> 24 -> 8
27 -> 14 -> 4
The number of products necessary to reach the single-digit
number is called the persistence number of that integer. Thus 715
and 88 have a persistence number of 3, while 27 has persistence 2.
Make a program to find the only two-digit number with persistence
greater than 3?
I was able to come up with a rough idea and the code is below but it doesn't seem to work:
num2=0
num3=0
num4=0
num=input("what is your number?")
while num in range(10,100):
    print 'step1'
    num1=num%10*num/10
    if num1-10>10:
        print 'step2'
        num2=num1%10*num1/10
    elif num2-num1>10:
        print 'step3'
        num3=num2%10*num2/10
    elif num3-num2>10:
        print 'step4'
        num4=num3%10*num3/10
    elif num4-num3>10:
        print 'step5'
        print num4
    else:
        break
The program is Python and I simply can't figure this out. If someone could possibly help me I would appreciate it greatly!
You should use a while or for loop to multiply the digits instead of hardcoding what to do with the first, second and so on digits.
In pseudocode...
productSoFar = 1
digitsLeftToMultiply = #the number
while there are digits left to multiply:
    get the next digit and
    update productSoFar and digitsLeftToMultiply
Also, use
10 <= n < 100
instead of
n in range(10, 100)
So you only do a couple of comparisons instead of a sequential lookup that takes time proportional to the length of the range.
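One possible way to turn the pseudocode into working Python is sketched below (the function names are my own choice). It peels digits off with divmod instead of hardcoding steps, and then counts how many products are needed.

```python
def digit_product(n):
    """Multiply together the decimal digits of a positive integer."""
    product_so_far = 1
    digits_left = n
    while digits_left > 0:
        digits_left, digit = divmod(digits_left, 10)
        product_so_far *= digit
    return product_so_far


def persistence(n):
    """Number of digit products needed to reach a single-digit number."""
    steps = 0
    while n >= 10:
        n = digit_product(n)
        steps += 1
    return steps


# Every two-digit number whose persistence is greater than 3.
two_digit_hits = [n for n in range(10, 100) if persistence(n) > 3]
```

The examples from the exercise can serve as a sanity check: persistence(715) and persistence(88) should both be 3, and persistence(27) should be 2.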
Functions are friends.
Consider a function, getEnds(x), which when passed an integer x will extract the first digit and the last digit (as integers) and return the result as a tuple in the form (first_digit, last_digit). If x is a single-digit number the tuple will contain one element and be in the form (x,); otherwise it will contain two. (A simple way to do this is to turn the number into a string, extract the first/last digit as a string, and then convert said strings back into numbers. However, there are many ways: just make sure to honor the function contract as stated above and, hopefully, in the function documentation.)
Then, where n is the current number we are finding the persistence for:
ends = getEnds(n)
while ends contains two elements
    n = first element of ends times second element of ends
    ends = getEnds(n)
# while terminates when ends contained only one element
# now it's only a matter of "counting" the persistence
For added points, make sure this is in an appropriately named and documented function as well, and consider using a recursive function instead of a while-loop.
Happy coding.
If you're trying to get the digits of a number, convert it into a string first and reference them with array notation.