OK, so I'm feeling a little stupid for not knowing this, but a coworker asked, so I'm asking here: I have written a Python algorithm that solves his problem. Given x > 0, add together all the numbers from 1 to x.
def intsum(x):
    if x > 0:
        return x + intsum(x - 1)
    else:
        return 0

>>> intsum(10)
55
First, what type of equation is this? And what is the correct way to get this answer, since it is clearly easier using some other method?
This is recursion, though for some reason you're labeling it like it's factorial.
In any case, the sum from 1 to n is also simply:
n * ( n + 1 ) / 2
(You can special case it for negative values if you like.)
Transforming recursively-defined sequences of integers into ones that can be expressed in a closed form is a fascinating part of discrete mathematics -- I heartily recommend Concrete Mathematics: A Foundation for Computer Science, by Ronald Graham, Donald Knuth, and Oren Patashnik (see, e.g., the Wikipedia entry about it).
However, the specific sequence you show, fac(x) = fac(x - 1) + x, according to a famous anecdote, was solved by Gauss when he was a child in first grade -- the teacher had given the pupils the task of summing the numbers from 1 to 100 to keep them quiet for a while, but two minutes later there was young Gauss with the answer, 5050, and the explanation: "I noticed that I can sum the first, 1, and the last, 100, that's 101; and the second, 2, and the next-to-last, 99, and that's again 101; and clearly that repeats 50 times, so, 50 times 101, 5050". Not rigorous as proofs go, but quite correct and appropriate for a six-year-old ;-).
In the same way (plus really elementary algebra) you can see that the general case is, as many have already said, (N * (N+1)) / 2 (the product is always even, since one of the two consecutive numbers must be even; so the division by two always produces an integer, as desired, with no remainder).
Here is how to prove the closed form for an arithmetic progression:
S = 1 + 2 + ... + (n-1) + n
S = n + (n-1) + ... + 2 + 1
2S = (n+1) + (n+1) + ... + (n+1) + (n+1)
^ you'll note that there are n terms there.
2S = n(n+1)
S = n(n+1)/2
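For instance, here is a minimal Python version of the closed form, replacing the recursion entirely (// is floor division, though as noted the product is always even, so there is never a remainder):

def intsum(n):
    # Closed-form sum of 1..n: O(1) instead of O(n) recursive calls.
    return n * (n + 1) // 2

>>> intsum(10)
55
>>> intsum(100)
5050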
I'm not allowed to comment yet, so I'll just add that you'll want to be careful using range(), as it starts at 0 and stops just short of its argument. You'll need to use range(n+1) to get the desired effect.
Sorry for the duplication...
sum(range(10)) != 55
sum(range(11)) == 55
OP has asked, in a comment, for a link to the story about Gauss as a schoolchild.
He may want to check out this fascinating article by Brian Hayes. It not only rather convincingly suggests that the Gauss story may be a modern fabrication, but also outlines how it would be rather difficult not to see the patterns involved in summing the numbers from 1 to 100; in fact, the only way to miss those patterns might be to solve the problem by writing a program.
The article also talks about different ways to sum arithmetic progressions, which is at the heart of OP's question. There is also an ad-free version here.
Larry is very correct with his formula, and it's the fastest way to calculate the sum of all integers up to n.
But for completeness, there are built-in Python functions that perform what you have done on lists with arbitrary elements. E.g.
sum()
>>> sum(range(11))
55
>>> sum([2,4,6])
12
or, more generally, reduce() (which lives in functools in Python 3):
>>> import operator
>>> from functools import reduce
>>> reduce(operator.add, range(11))
55
Consider that 1+N, 2+(N-1), 3+(N-2), and so on all add up to the same number, N+1, and there are approximately N/2 pairs like that (exactly N/2 if N is even).
What you have there is called an arithmetic sequence, and as suggested, you can compute it directly without the overhead that results from the recursion.
And I would say this is homework, despite what you say.
I'm looking for a pseudo-random number generator (an algorithm where you input a seed number and it outputs a different 'random-looking' number, and the same seed will always generate the same output) for numbers between 1 and 95^1,312,000.
I would use the Linear Feedback Shift Register (LFSR) PRNG, but if I did, I would have to convert the seed number (which could be up to 1.2 million digits long in base-10) into a binary number, which would be so massive that I think it would take too long to compute.
In response to a similar question, the Feistel cipher was recommended, but I didn't understand the vocabulary of the wiki page for that method (I'm going into 10th grade, so I don't have a degree in encryption), so if you could use layman's terms, I would really appreciate it.
Is there an efficient way of doing this which won't take until the end of time, or is this problem impossible?
Edit: I forgot to mention that the PRNG sequence needs to have a full period. My mistake.
A simple way to do this is to use a linear congruential generator with modulus m = 95^1312000.
The formula for the generator is x_(n+1) = a*x_n + c (mod m). By the Hull-Dobell theorem, it will have full period if and only if gcd(m, c) = 1 and a - 1 is divisible by every prime factor of m; since m = 95^1312000 and 95 = 5 * 19, that means 95 must divide a - 1. Furthermore, if you want good second values (right after the seed) even for very small seeds, a and c should be fairly large. Also, your code can't store these values as literals (they would be much too big); instead, you need to be able to reliably produce them on the fly. After a bit of trial and error to make sure gcd(m, c) = 1, I hit upon:
import random

def get_book(n):
    random.seed(1941)  # Borges' "The Library of Babel" was published in 1941
    m = 95**1312000
    a = 1 + 95 * random.randint(1, m//100)
    c = random.randint(1, m - 1)  # chosen so that math.gcd(c, m) == 1
    return (a*n + c) % m
For example:
>>> book = get_book(42)
>>> book % 10**100
4779746919502753142323572698478137996323206967194197332998517828771427155582287891935067701239737874
shows the last 100 digits of "book" number 42. Given Python's built-in support for large integers, the code runs surprisingly fast (it takes less than 1 second to grab a book on my machine).
If you have a method that can produce a pseudo-random digit, then you can concatenate as many together as you want. It will be just as repeatable as the underlying PRNG.
However, you'll probably run out of memory scaling that up to millions of digits and attempting to do arithmetic. Normally, stuff on that scale isn't done on "numbers"; it's done on byte vectors or something similar.
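One minimal sketch of that idea in Python (the function name is mine; it leans on a seeded random.Random as the digit source):

import random

def pseudo_random_digits(seed, n_digits):
    # A dedicated, seeded generator makes the digit string repeatable:
    # the same seed always produces the same digits.
    rng = random.Random(seed)
    # Concatenating into a string sidesteps big-integer arithmetic entirely.
    return ''.join(str(rng.randrange(10)) for _ in range(n_digits))

digits = pseudo_random_digits(42, 10**6)  # a million repeatable digits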
Given positive integers b, c, m where (b < m) is True, the task is to find a positive integer e such that
(b**e % m == c) is True
where ** is exponentiation (as in Ruby and Python; ^ in some other languages) and % is the modulo operation. What is the most efficient algorithm (with the lowest big-O complexity) to solve it?
Example:
Given b=5; c=8; m=13, this algorithm must find e=7, because 5**7 % 13 == 8.
From the % operator I'm assuming that you are working with integers.
You are trying to solve the Discrete Logarithm problem. A reasonable algorithm is Baby step, giant step, although there are many others, none of which are particularly fast.
The difficulty of finding a fast solution to the discrete logarithm problem is a fundamental part of some popular cryptographic algorithms, so if you find a better solution than any of those on Wikipedia please let me know!
This isn't a simple problem at all. It is called computing the discrete logarithm, and it is the inverse operation to modular exponentiation.
There is no efficient algorithm known. That is, if N denotes the number of bits in m, all known algorithms run in O(2^(N^C)) for some constant C > 0.
Python 3 Solution:
Thankfully, SymPy has implemented this for you!
SymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer algebra system (CAS) while keeping the code as simple as possible in order to be comprehensible and easily extensible. SymPy is written entirely in Python.
This is the documentation on the discrete_log function. Use this to import it:
from sympy.ntheory import discrete_log
Their example computes log_7(15) (mod 41):
>>> discrete_log(41, 15, 7)
3
Because of the (state-of-the-art, mind you) algorithms it employs to solve it, you'll get O(sqrt(n)) on most inputs you try. It's considerably faster when your prime modulus has the property that p - 1 factors into a lot of small primes.
Consider a prime on the order of 100 bits (~2^100). With sqrt(n) complexity, that's still 2^50 iterations. That being said, don't reinvent the wheel. This does a pretty good job. I might also add that it was almost 4x more memory efficient than Mathematica's MultiplicativeOrder function when I ran it with large-ish inputs (44 MiB vs. 173 MiB).
Since a duplicate of this question was asked under the Python tag, here is a Python implementation of baby step, giant step, which, as @MarkBeyers points out, is a reasonable approach (as long as the modulus isn't too large):
import math

def baby_steps_giant_steps(a, b, p, N=None):
    if not N:
        N = 1 + int(math.sqrt(p))
    # initialize the baby_steps table
    baby_steps = {}
    baby_step = 1
    for r in range(N + 1):
        baby_steps[baby_step] = r
        baby_step = baby_step * a % p
    # now take the giant steps
    giant_stride = pow(a, (p - 2) * N, p)  # a**(-N) mod p, since p is prime
    giant_step = b
    for q in range(N + 1):
        if giant_step in baby_steps:
            return q * N + baby_steps[giant_step]
        else:
            giant_step = giant_step * giant_stride % p
    return "No Match"
In the above implementation, an explicit N can be passed to fish for a small exponent even if p is cryptographically large. It will find the exponent as long as the exponent is smaller than N**2. When N is omitted, the exponent will always be found, but not necessarily in your lifetime or with your machine's memory if p is too large.
For example, if
p = 70606432933607
a = 100001
b = 54696545758787
then pow(a, b, p) evaluates to 67385023448517
and
>>> baby_steps_giant_steps(a,67385023448517,p)
54696545758787
This took about 5 seconds on my machine. For the exponent and the modulus of those sizes, I estimate (based on timing experiments) that brute force would have taken several months.
Discrete logarithm is a hard problem
Computing discrete logarithms is believed to be difficult. No efficient general method for computing discrete logarithms on conventional computers is known.
I will add here a simple brute-force algorithm which tries every possible value from 1 to m and outputs a solution if one is found. Note that there may be more than one solution to the problem, or none at all. This algorithm returns the smallest possible value, or -1 if none exists.
def bruteLog(b, c, m):
    s = 1
    for i in range(m):
        s = (s * b) % m
        if s == c:
            return i + 1
    return -1

print(bruteLog(5, 8, 13))
and here you can see that 3 is in fact the solution:
print(5**3 % 13)  # prints 8, which is c
There is a better algorithm, but because it is often asked to be implemented in programming competitions, I will just give you a link to explanation.
As said, the general problem is hard. However, a practical way to find e, if and only if you know e is going to be small (like in your example), would be just to try each e from 1 upward.
By the way, e == 3 is the first solution to your example, and you can obviously find that in 3 steps. Compare that to solving the non-discrete version and naively looking for integer solutions, i.e.
e = log(c + n*m) / log(b), where n is a non-negative integer
which finds e == 3 in 9 steps.
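A small sketch of that naive non-discrete search (the function name is mine, and the exact-power check guards against floating-point rounding in the log):

import math

def nondiscrete_search(b, c, m, max_n=100):
    # Try n = 0, 1, 2, ...: e = log_b(c + n*m) is a solution exactly
    # when c + n*m is an exact power of b.
    for n in range(max_n + 1):
        e = round(math.log(c + n * m, b))
        if e >= 1 and b**e == c + n * m:
            return e, n
    return None

print(nondiscrete_search(5, 8, 13))  # (3, 9): finds e == 3 at n == 9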
I found some Python code that claims checking primality based on Fermat's little theorem:
def CheckIfProbablyPrime(x):
    return (2 << x - 2) % x == 1
My questions:
How does it work?
What's its relation to Fermat's little theorem?
How accurate is this method?
If it's not accurate, what's the advantage of using it?
I found it here.
1. How does it work?
Fermat's little theorem says that if a number x is prime, then for any integer a:
a**x ≡ a (mod x)
If we divide both sides by a (valid when a is not a multiple of x), then we can re-write the equation as follows:
a**(x-1) ≡ 1 (mod x)
I'm going to punt on proving how this works (your first question) because there are many good proofs (better than I can provide) on this wiki page and under some Google searches.
2. Relation between code and theorem
So, the function you posted checks if (2 << x - 2) % x == 1.
First off, (2 << x-2) is the same thing as writing 2**(x-1). That's because << is the logical left-shift operator, which is explained better here. The relation between bit-shifting and multiplying by powers of 2 is specific to the way that numbers are represented on computers (in binary), but it all boils down to:
2 << n == 2**(n+1)
I can subtract 1 from the exponent on both sides, which gives:
2 << (n-1) == 2**n
Now, we know from above that for any number a:
a**(x-1) ≡ 1 (mod x)
Let's say then that a = 2. That gives us:
2**(x-1) ≡ 1 (mod x)
Well heck, 2**(x-1) is the same as 2 << (x-2)! So then we can write:
(2 << x-2) ≡ 1 (mod x)
Now, the math version of mod looks kind of odd, but we can write the equivalent code as follows:
(2 << x - 2) % x == 1
And that's the relation.
3. Accuracy of method
So, I think "accuracy" is a bad term here, because Fermat's little theorem is definitely true for all prime numbers. However, that does not mean that it's true or false for all numbers -- which is to say, if I have some number i, and I'm not sure if i is prime, using Fermat's Little Relation will only tell me if it is definitely NOT prime. If Fermat's Little Relation is true, then i could not be prime. These kinds of numbers are called pseudoprime numbers, or more specifically in this case Fermat Pseudoprime numbers.
If this sort of thing sounds interesting, take a look at the Carmichael numbers AKA the Absolute Fermat Pseudoprimes, which pass the Fermat test in any base but are not prime. In our case we run into numbers which pass in base 2, but Fermat's little theorem might not hold for these numbers in other bases -- the Carmichael numbers pass the test for all bases coprime to x.
On the wiki page of the Carmichael there is a discussion of their distribution over the range of natural numbers -- they appear exponentially with the size of the range over which you're looking, though the exponent is less than 1 (about 1/3). So, if you're searching for primes over a big range, you're going to run into exponentially more Carmichael numbers, which are effectively false positives for this method CheckIfProbablyPrime. That might be okay, depending on your input and how much you care about running into false positives.
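For instance, the smallest base-2 false positive (a quick check using the function from above):

>>> CheckIfProbablyPrime(341)  # 341 == 11 * 31: a base-2 Fermat pseudoprime
True
>>> CheckIfProbablyPrime(31)   # a genuine prime passes too
True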
4. Why is this useful?
In short, it's an optimization.
The main reason to use something like this is to speed up a search for prime numbers. Actually checking whether a number is prime is expensive, far more so than this test. So, if we can avoid the full check for most candidates, we can devote more time to the numbers that matter. Since the Fermat relation only says "possibly prime" or "definitely composite" (it never rejects a prime), and it can be checked very quickly, we can toss it into an is_prime loop to discard a fair amount of numbers cheaply. So, we can speed things up.
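A toy sketch of that filtering loop (trial division stands in for the expensive check; the helper names are mine):

def probably_prime(x):
    # Cheap Fermat filter, base 2; pseudoprimes like 341 slip through.
    return x == 2 or (x > 2 and (2 << x - 2) % x == 1)

def is_prime(x):
    # Run the expensive check (trial division here) only on survivors.
    if not probably_prime(x):
        return False
    return all(x % d for d in range(2, int(x**0.5) + 1))

>>> [n for n in range(2, 30) if is_prime(n)]
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]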
There are many primality checks like this one; you can find some coded prime checkers here.
Final Note
One of the confusing things about this optimization is that it uses the bit-shift operator << instead of the exponentiation operator **. This is because bit shifting is one of the fastest operations your computer can do, while exponentiation is slower by some amount. It is not always a meaningful optimization, because most modern languages know how to replace what we write with more optimized operations. But that's my guess as to why the authors of this code used the bit shift instead of 2**(x-1).
Edit: As MarkDickinson notes, taking the exponent of a number and then modding it explicitly is not the best way to do it. This is a thing called modular exponentiation, and there exist algorithms which can do it faster than the way we've written it. Python's builtin pow actually implements one of these algorithms, and takes an optional third argument to mod by. So we can write a final version of this function:
def CheckIfProbablyPrime(x):
    return pow(2, x - 1, x) == 1
Which is not only more readable but also faster than the confusing bit-shift crap. You know what they say.
I believe the code in your example is incorrect, because the binary left-shift operator is not equivalent to raising a number to a power, which is what Fermat's little theorem uses. With a base of two, a binary left shift by x is equal to a power of x + 1, which is NOT what this version of Fermat's little theorem uses.
Instead, use ** for integer powers in Python.
def CheckIfProbablyPrime(x):
    return (2 ** x - 2) % x == 0
" p − a is an integer multiple of p " therefore for primes, following theorem, result of 2 in power of x - 2 divided by x will leave a leftover of 0 (modulo '%' checks for number left over after division.
For the x - 1 version:
def CheckIfProbablyPrime(a, x):
    return (a ** (x - 1) - 1) % x == 0
Both variations should evaluate to True for prime numbers, because they represent Fermat's little theorem in Python.
Lately I've been solving some challenges from Google Foobar for fun, and now I've been stuck on one of them for more than 4 days. It is about a recursive function defined as follows:
R(0) = 1
R(1) = 1
R(2) = 2
R(2n) = R(n) + R(n + 1) + n (for n > 1)
R(2n + 1) = R(n - 1) + R(n) + 1 (for n >= 1)
The challenge is writing a function answer(str_S) where str_S is a base-10 string representation of an integer S, which returns the largest n such that R(n) = S. If there is no such n, return "None". Also, S will be a positive integer no greater than 10^25.
I have investigated a lot about recursive functions and solving recurrence relations, but with no luck. I printed the first 500 numbers and found no pattern whatsoever. I used the following code, which uses recursion, so it gets really slow when the numbers start getting big.
def getNumberOfZombits(time):
    if time == 0 or time == 1:
        return 1
    elif time == 2:
        return 2
    else:
        if time % 2 == 0:
            newTime = time // 2
            return getNumberOfZombits(newTime) + getNumberOfZombits(newTime + 1) + newTime
        else:
            newTime = time // 2  # integer division, so rounds down
            return getNumberOfZombits(newTime - 1) + getNumberOfZombits(newTime) + 1
The challenge also included some test cases so, here they are:
Test cases
==========
Inputs:
(string) str_S = "7"
Output:
(string) "4"
Inputs:
(string) str_S = "100"
Output:
(string) "None"
I don't know whether I need to reduce the recurrence relation to something simpler, but since there is one case for even and one for odd numbers, I find it really hard to do (I haven't learned about this in school yet, so everything I know about the subject comes from internet articles).
So, any help at all guiding me to finish this challenge will be welcome :)
Instead of trying to simplify this function mathematically, I simplified the algorithm in Python. As suggested by @LambdaFairy, I implemented memoization in the getNumberOfZombits(time) function. This optimization sped up the function a lot.
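A minimal sketch of that memoized version (using functools.lru_cache is my choice; any dictionary-based cache works the same way):

from functools import lru_cache

@lru_cache(maxsize=None)
def getNumberOfZombits(time):
    # Same recurrence as before; the cache collapses repeated subcalls,
    # and the n/2 recursion keeps the depth at O(log n).
    if time == 0 or time == 1:
        return 1
    if time == 2:
        return 2
    half = time // 2
    if time % 2 == 0:
        return getNumberOfZombits(half) + getNumberOfZombits(half + 1) + half
    return getNumberOfZombits(half - 1) + getNumberOfZombits(half) + 1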
Then I passed to the next step: finding which input produced a given number of rabbits. I had analyzed the function before by plotting it, and I knew the even inputs reached high outputs first, with the odd inputs only reaching the same level some time later. As we want the highest input for a given output, I first needed to search the even numbers and then the odd numbers.
As you can see, the odd numbers always take longer than the even ones to reach the same output.
The problem is that we could not search the numbers by incrementing 1 each time (that was too slow). What I did instead was implement a binary-search-like algorithm. First I searched the even numbers that way until I found an answer or ran out of numbers to search; then I did the same for the odd numbers, and if an answer was found there, I replaced whatever I had before with it (as it was necessarily bigger than the previous answer).
I have the source code I used to solve this, so if anyone needs it I don't mind sharing it :)
The key to solving this puzzle was using a binary search.
As you can see from the sequence generators, they rely on a roughly n/2 recursion, so calculating R(N) takes about 2*log2(N) recursive calls; and of course you need to do it for both the odd and the even sequences.
That's not too bad, but you need to figure out where to search for the N which will give you the input. To do this, I first implemented a search for upper and lower bounds for N: I walked N up by powers of 2 until I had an N and a 2N that formed the lower and upper bounds, respectively, for each sequence (odd and even).
With these bounds, I could then do a binary search between them to quickly find the value of N, or its non-existence.
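A sketch of that bounds-then-bisect idea, assuming the memoized getNumberOfZombits from the sketch above as R (the helper names and the small-S special cases are mine; it relies on R being increasing along each parity class, which matches the plots described earlier):

def search_parity(S, R, start):
    # Binary search over n = start + 2*k, assuming R is increasing
    # along this parity class.
    k_hi = 1
    while R(start + 2 * k_hi) < S:  # double until we bracket S
        k_hi *= 2
    k_lo = 0
    while k_lo <= k_hi:
        k = (k_lo + k_hi) // 2
        v = R(start + 2 * k)
        if v == S:
            return start + 2 * k
        if v < S:
            k_lo = k + 1
        else:
            k_hi = k - 1
    return None

def answer(str_S):
    S = int(str_S)
    if S == 1:
        return "1"  # R(0) == R(1) == 1, so the largest such n is 1
    if S == 2:
        return "2"  # R(2) == 2
    R = getNumberOfZombits
    hits = [n for n in (search_parity(S, R, 2), search_parity(S, R, 3))
            if n is not None]
    return str(max(hits)) if hits else "None"

>>> answer("7")
'4'
>>> answer("100")
'None'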
What is the difference between the following statements?
Statement 1:
var=2**2*3
Statement 2:
var2=2*2*3
I see no difference.
This raises the following question.
Why is Statement 1 used if we can use Statement 2?
Try:
2**3*2
and
2*3*2
to see the difference.
** is the operator for "power of". In your particular operation, 2 to the power of 2 yields the same as 2 times 2.
Double stars (**) are exponentiation. So "2 times 2" and "2 to the power 2" are the same. Change the numbers and you'll see a difference.
2**2 means 2 squared (2^2)
2*2 mean 2 times 2 (2x2)
In this case they happen to have the same value, but...
3**3*4 != 3*3*4
To specifically answer your question ("Why is Statement 1 used if we can use Statement 2?"), I might suggest that the programmer was thinking in a mathematically broader sense. Specifically, perhaps the broader equation is a power equation, and the fact that both first numbers are "2" is more coincidence than mathematical reality. I'd want to make sure that the broader context of the code supports it being var = x * x * y in all cases, rather than in this specific case alone. This could get you in big trouble if x is anything but 2.
2**2 = 2 power-of 2
2*2 = 2 times 2
The ** operator in Python is really "power;" that is, 2**3 = 8.
The top one is a "power" operator, so in this case it is the same: 2 * 2 is equal to 2 to the power of 2. If you put a 3 in the middle position, you will see a difference.
A double asterisk means "to the power of"; a single asterisk means "multiplied by". 2**2 is the same as 2x2, which is why both answers came out as 4.
Power has higher precedence than multiplication, so:
2**2*3 = (2^2)*3
2*2*3 = 2*2*3
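A quick interpreter session makes the precedence concrete (a minimal sketch):

>>> 2**2*3      # parsed as (2**2) * 3
12
>>> 2**(2*3)
64
>>> 2*2*3
12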