calculate catalan numbers up to a billion - python

I'm new to Python (and programming in general). I was asked in my class to calculate Catalan numbers up to a billion, but the program I wrote for it is not working as intended.
from numpy import division
C=1
n=0
while C<=10000000000:
    print (C)
    C=(4*n+2)/(n+2)*C
    n=n+1
This is what it prints
1,
1,
2,
4,
8,
24,
72,
216,
648,
1944,
5832,
17496,
52488,
157464,
472392,
1417176,
4251528,
12754584,
38263752,
114791256,
344373768,
1033121304,
3099363912,
9298091736,
As you can see from my fourth iteration onwards, I get the wrong number and I don't understand why.
EDIT:
The mathematical definition I used is not wrong! I know the Wiki has another definition but this one is not wrong.
C_0 = 1, C_{n+1} = (4*n+2)/(n+2) * C_n

C=(4*n+2)/(n+2)*C
This applies the calculation in the wrong order. Because you are using integer arithmetic, (4*n+2)/(n+2) is truncated before it is multiplied by C, so information is lost. Multiply by C first and divide last. The correct calculation is:
C=C*(4*n+2)/(n+2)
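In Python 3, `/` always produces a float, so the reordered expression needs floor division (`//`) to stay in exact integers. A minimal sketch, assuming Python 3; the function name `catalan_up_to` is only illustrative:

```python
def catalan_up_to(limit):
    """Return all Catalan numbers <= limit, using exact integer arithmetic."""
    result = []
    c, n = 1, 0
    while c <= limit:
        result.append(c)
        # Multiply first, then divide: c * (4n + 2) is always divisible by (n + 2).
        c = c * (4 * n + 2) // (n + 2)
        n += 1
    return result

print(catalan_up_to(100))  # [1, 1, 2, 5, 14, 42]
```

Calling `catalan_up_to(10**9)` would cover the "up to a billion" requirement from the question.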

Try this:
from scipy.special import factorial
C = 1
n = 0
while C <= 10000000000:
    print(C)
    # exact=True returns Python ints, so // keeps the result an exact integer
    C = factorial(2*n, exact=True)//(factorial((n+1), exact=True)*factorial(n, exact=True))
    n = n + 1
It works for me :)

This is solved using recursion:
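from functools import lru_cache
@lru_cache(maxsize=None)  # memoize so each catalan(i) is computed only once
def catalan(n):
    if n <= 1:
        return 1
    res = 0
    for i in range(n):
        res += catalan(i) * catalan(n-i-1)
    return res
n = 0
while catalan(n) <= 10000000000:  # stop by value, not by index
    print(catalan(n))
    n += 1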
you can read more about Catalan numbers here or here

Based on the closed-form expression for Catalan numbers, C_n = (2n)! / ((n+1)! * n!):
from math import factorial
C = 1
n = 0
while C <= 10000000000:
    print(C)
    C = (factorial(2 * n)) / (factorial(n + 1) * factorial(n))
    n = n + 1
Returns:
1
1.0
1.0
2.0
5.0
14.0
42.0
132.0
429.0
1430.0
4862.0
16796.0
58786.0
208012.0
742900.0
2674440.0
9694845.0
35357670.0
129644790.0
477638700.0
1767263190.0
6564120420.0

The problem
Your mathematical definition of Catalan numbers becomes incorrect when translated literally into code.
This is because of operator precedence combined with integer division in programming languages such as Python.
Multiplication and division have the same precedence, so they are evaluated left to right. The division (4*n+2)/(n+2) therefore happens before the multiplication by C. When n is 2, (4*n+2)/(n+2) becomes 10/4, which is 2 in integer arithmetic. Multiplying by C, which is 2 at this stage, gives 4 instead of the expected 5.
Once one number in the series is wrong, every later number is wrong too, because each is computed from its predecessor.
A possible work around
Here's the definition of the Catalan numbers in terms of binomial coefficients: C_n = C(2n, n) / (n + 1).
Which means the nth Catalan number is given by:
import operator as op
from functools import reduce  # reduce is not a builtin in Python 3
def ncr(n, r):
    r = min(r, n-r)
    if r == 0: return 1
    numer = reduce(op.mul, range(n, n-r, -1))
    denom = reduce(op.mul, range(1, r+1))
    return numer//denom
def catalan(n):
    return ncr(2*n, n)//(n+1)
This is not very efficient, but it is at least correct.
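On Python 3.8 or later, the standard library already provides the binomial coefficient, so the helper above may not be needed. A short sketch under that assumption (`math.comb` is standard; the function name `catalan` simply mirrors the one above):

```python
from math import comb  # available in Python 3.8+

def catalan(n):
    """n-th Catalan number via C(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)

print([catalan(n) for n in range(8)])  # [1, 1, 2, 5, 14, 42, 132, 429]
```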
The right fix
To compute the series, you can use the recurrence directly, multiplying by C before dividing:
N=1000000 # limit
C=1
for i in range(0, N+1):
    print(i, C)
    C = (2*(2*i +1)*C)//(i+2)
For the first few, it looks like this:
0 1
1 1
2 2
3 5
4 14
5 42
6 132
7 429
8 1430
9 4862
10 16796

Related

More efficient way to calculate the nth term

I have the recurrence relation (n-2) a_n = 2(4n-9) a_{n-1} - (15n-38) a_{n-2} - 2(2n-5) a_{n-3}, with initial conditions a_0 = 0, a_1 = 1 and a_2 = 3. I mainly want to calculate a_n mod n and a_{2n} mod n for all odd composite numbers n from 1 up to, say, 2.5 million.
I have written code in Python. Using sympy and memoization, I did the computation for a_n mod n, but it took more than 2 hours. It got worse when I tried it for a_{2n} mod n. One main reason for the slowness is that the recurrence has non-constant coefficients. Is there more efficient code that I could use? Or would it help to do this in some other language (preferably one with a built-in function, or a function from some package, that can be used directly for the primality testing part of the code)?
This is my code.
from functools import lru_cache
import sympy
@lru_cache(maxsize = 1000)
def f(n):
    if n==0:
        return 0
    elif n==1:
        return 1
    elif n==2:
        return 3
    else:
        return ((2*((4*n)-9)*f(n-1)) - (((15*n)-38)*f(n-2)) - (2*((2*n)-5)*f(n-3)))//(n-2)
for n in range(1,2500000,2):
    if sympy.isprime(n)==False:
        print(n,f(n)%n)
    if n%10000==1:
        print(n,'check')
The last 'if' statement is just to check how much progress is being made.
For a somewhat faster approach that avoids any memory issues, you could calculate the a_n directly in sequence, always retaining only the last three values in a queue:
from collections import deque
a = deque([0, 1, 3])
for n in range(3, 2_500_000):
    a.append(((8 * n - 18) * a[2]
              - (15 * n - 38) * a[1]
              - (4 * n - 10) * a.popleft())
             // (n - 2))
    if n % 2 == 1:
        print(n, a[2] % n)
3 2
5 0
7 6
9 7
11 1
[...]
2499989 1
2499991 921156
2499993 1210390
2499995 1460120
2499997 2499996
2499999 1195814
This took about 50 minutes on my PC. Note I avoided the isprime() call in view of Rodrigo's comment.
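If only the odd composite n are wanted, one option is to pair the same rolling recurrence with a sieve of Eratosthenes instead of per-number primality tests. This is a sketch I have not timed; the function name `odd_composite_residues` and the `limit` parameter are illustrative, and `limit` is assumed to be at least 3:

```python
def odd_composite_residues(limit):
    """Yield (n, a_n mod n) for every odd composite n with 3 <= n < limit."""
    # Sieve of Eratosthenes: is_prime[k] is 1 when k is prime.
    is_prime = bytearray([1]) * limit
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            is_prime[p * p::p] = bytearray(len(range(p * p, limit, p)))

    a0, a1, a2 = 0, 1, 3  # a_(n-3), a_(n-2), a_(n-1) when n = 3
    for n in range(3, limit):
        a_n = ((8 * n - 18) * a2 - (15 * n - 38) * a1 - (4 * n - 10) * a0) // (n - 2)
        a0, a1, a2 = a1, a2, a_n
        if n % 2 == 1 and not is_prime[n]:
            yield n, a_n % n

print(list(odd_composite_residues(10)))  # [(9, 7)]
```

The full integers a_n are still carried along because the division by n - 2 has to be exact, so most of the running time will likely remain in the big-integer arithmetic rather than in the primality filter.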

How to check if abc == sqrt(a^b^c) very fast (preferably Python)?

Let a,b,c be the first digits of a number (e.g. 523 has a=5, b=2, c=3). I am trying to check if abc == sqrt(a^b^c) for many values of a,b,c. (Note: abc = 523 stands for the number itself.)
I have tried this with Python, but for a>7 it already took a significant amount of time to check just one digit combination. I have tried rewriting the equality as multiple logs, like log_c[log_b[log_a[ (abc)^2 ]]] == 1, however, I encountered Math Domain Errors.
Is there a fast / better way to check this equality (preferably in Python)?
Note: Three digits are an example for StackOverflow. The goal is to test much higher powers with seven to ten digits (or more).
Here is the very basic piece of code I have used so far:
for a in range(1,10):
    for b in range(1,10):
        for c in range(1,10):
            N = a*10**2 + b*10 + c
            X = a**(b**c)
            if N == X:
                print(a,b,c)
The problem is that you are uselessly calculating very large integers, which can take much time as Python has unlimited size for them.
You should limit the values of c you test.
If your largest possible number is 1000, you want a**b**c < 1000**2, so b**c < log(1000**2, a) = 2*log(1000, a), and therefore c < log(2*log(1000, a), b).
Note that you should exclude a = 1, as any power of it is 1, and b = 1, as b^c would then be 1, and the whole expression is just a.
To test if the square root of a^b^c is abc, it's better to test if a^b^c is equal to the square of abc, in order to avoid using floats.
So, the code, that (as expected) doesn't find any solution under 1000, but runs very fast:
from math import log
for a in range(2,10):
    for b in range(2,10):
        # + 1 because range() excludes its upper bound and int() already rounds down
        for c in range(1,int(log(2*log(1000, a), b)) + 1):
            N2 = (a*100 + b*10 + c)**2
            X = a**(b**c)
            if N2 == X:
                print(a,b,c)
You are looking for numbers whose square root is equal to a three-digit integer. That means your X has to have at most 6 digits, or more precisely log10(X) < 6. Once your a gets larger, the potential solutions you're generating are much larger than that, so we can eliminate large swathes of them without needing to check them (or needing to calculate a ** b ** c, which can get very large: 9 ** 9 ** 9 has 369_693_100 DIGITS!).
log10(X) < 6 gives us log10(a ** b ** c) < 6 which is the same as b ** c * log10(a) < 6. Bringing it to the other side: log10(a) < 6 / b ** c, and then a < 10 ** (6 / b ** c). That means I know I don't need to check for any a that exceeds that. Correcting for an off-by-one error gives the solution:
for b in range(1, 10):
    for c in range(1, 10):
        t = b ** c
        for a in range(1, 1 + min(9, int(10 ** (6 / t)))):
            N = a * 100 + b * 10 + c
            X = a ** t
            if N * N == X:
                print(a, b, c)
Running this shows that there aren't any valid solutions to your equation, sadly!
a**(b**c) grows quite fast, and most of the time it will far exceed any three-digit number. Most of the calculations you are doing are useless. To optimize your solution, do the following:
Iterate over all three-digit numbers.
For each of these numbers, square it and check whether the square is a power of the first digit.
For those that are, check whether the exponent is in turn a power of the second digit.
Finally, check whether that second exponent is the third digit.
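The steps above can be sketched as follows. The helper name `exact_log` is illustrative, and for brevity the last two steps are folded into a single comparison `b ** c == t`, which is cheap because b and c are single digits:

```python
def exact_log(value, base):
    """Return e >= 1 such that base**e == value, or None if no such e exists."""
    if base < 2 or value < base:
        return None
    e = 0
    while value % base == 0:
        value //= base
        e += 1
    return e if value == 1 else None

def find_solutions():
    """Return all three-digit numbers abc with abc**2 == a**(b**c)."""
    solutions = []
    for n in range(100, 1000):
        a, b, c = n // 100, (n // 10) % 10, n % 10
        t = exact_log(n * n, a)  # exponent t with a**t == n*n, if any
        if t is not None and b ** c == t:
            solutions.append(n)
    return solutions

print(find_solutions())  # [] (no three-digit solutions)
```

No huge power is ever built here: the only large-ish value is n * n, which has at most six digits.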

Prime number generation using Fibonacci possible?

I'm generating prime numbers from Fibonacci as follows (using Python, with mpmath and sympy for arbitrary precision):
from mpmath import *
def GCD(a,b):
    while a:
        a, b = fmod(b, a), a
    return b
def generate(x):
    mp.dps = round(x, int(log10(x))*-1)
    if x == GCD(x, fibonacci(x-1)):
        return True
    if x == GCD(x, fibonacci(x+1)):
        return True
    return False
for x in range(1000, 2000):
    if generate(x):
        print(x)
It's a rather small algorithm but seemingly generates all primes (except for 5 somehow, but that's another question). I say seemingly because a very small percentage of the generated numbers (0.5% under 1000 and 0.16% under 10K, getting less and less) isn't prime. For instance, under 1000, the numbers 323, 377 and 442 are also generated. These numbers are not prime.
Is there something off in my script? I try to account for precision by relating the .dps setting to the number being calculated. Can it really be that Fibonacci and prime numbers are seemingly so related, but then when it gets detailed they aren't? :)
For this type of problem, you may want to look at the gmpy2 library. gmpy2 provides access to the GMP multiple-precision library which includes gcd() and fib() functions which calculate the greatest common divisor and the n-th fibonacci numbers quickly, and only using integer arithmetic.
Here is your program re-written to use gmpy2.
import gmpy2
def generate(x):
    if x == gmpy2.gcd(x, gmpy2.fib(x-1)):
        return True
    if x == gmpy2.gcd(x, gmpy2.fib(x+1)):
        return True
    return False
for x in range(7, 2000):
    if generate(x):
        print(x)
You shouldn't be using any floating-point operations. You can calculate the GCD just using the builtin % (modulo) operator.
Update
As others have commented, you are checking for Fibonacci pseudoprimes. The actual test is slightly different than your code. Let's call the number being tested n. If n is divisible by 5, then the test passes if n evenly divides fib(n). If n divided by 5 leaves a remainder of either 1 or 4, then the test passes if n evenly divides fib(n-1). If n divided by 5 leaves a remainder of either 2 or 3, then the test passes if n evenly divides fib(n+1). Your code doesn't properly distinguish between the three cases.
If n evenly divides another number, say x, it leaves a remainder of 0. This is equivalent to x % n being 0. Calculating all the digits of the n-th Fibonacci number is not required, because the test only cares about the remainder. Instead of calculating the Fibonacci number to full precision, you can calculate the remainder at each step. The following code calculates just the remainder of the Fibonacci numbers. It is based on the code given by @pts in the question "Python mpmath not arbitrary precision?"
def gcd(a,b):
    while b:
        a, b = b, a % b
    return a
def fib_mod(n, m):
    if n < 0:
        raise ValueError
    def fib_rec(n):
        if n == 0:
            return 0, 1
        else:
            a, b = fib_rec(n >> 1)
            c = a * ((b << 1) - a)
            d = b * b + a * a
            if n & 1:
                return d % m, (c + d) % m
            else:
                return c % m, d % m
    return fib_rec(n)[0]
def is_fib_prp(n):
    if n % 5 == 0:
        return not fib_mod(n, n)
    elif n % 5 == 1 or n % 5 == 4:
        return not fib_mod(n-1, n)
    else:
        return not fib_mod(n+1, n)
It's written in pure Python and is very quick.
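As a sanity check on the three cases, here is a deliberately simple version that computes the remainder by plain iteration instead of the fast doubling used above. It is only a sketch for cross-checking small inputs (it takes O(n) steps per test), and the name `fib_mod_iter` is my own:

```python
def fib_mod_iter(n, m):
    """F(n) mod m by plain iteration: O(n) steps, values kept below m."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, (a + b) % m
    return a % m

def is_fib_prp(n):
    """Fibonacci probable-prime test with the three cases for n mod 5."""
    r = n % 5
    if r == 0:
        return fib_mod_iter(n, n) == 0
    if r in (1, 4):
        return fib_mod_iter(n - 1, n) == 0
    return fib_mod_iter(n + 1, n) == 0

for n in (7, 9, 13, 15, 323, 377):
    print(n, is_fib_prp(n))
```

Here 7 and 13 pass and 9 and 15 fail, as expected. 323 = 17 * 19 and 377 = 13 * 29 pass despite being composite, which matches two of the false positives reported in the question: they are Fibonacci pseudoprimes, not a precision artifact.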
The sequence of numbers commonly known as the Fibonacci numbers is just a special case of a general Lucas sequence L(n) = p*L(n-1) - q*L(n-2). The usual Fibonacci numbers are generated by (p,q) = (1,-1). gmpy2.is_fibonacci_prp() accepts arbitrary values for p and q. Called with p = 1 and q = -1, it should match the results of the is_fib_prp(n) given above.
Disclaimer: I maintain gmpy2.
This isn't really a Python problem; it's a math/algorithm problem. You may want to ask it on the Math StackExchange instead.
Also, there is no need for any non-integer arithmetic whatsoever: you're computing floor(log10(x)) which can be done easily with purely integer math. Using arbitrary-precision math will greatly slow this algorithm down and may introduce some odd numerical errors too.
Here's a simple floor_log10(x) implementation:
from __future__ import division # if using Python 2.x
def floor_log10(x):
    res = 0
    if x < 1:
        raise ValueError
    while x >= 10:  # stop at the last digit so that floor_log10(1) == 0
        x //= 10
        res += 1
    return res

What is the best way to generate Pythagorean triples?

I first tried the simple code that checks every combination of a and b and tests whether the resulting c is an integer, but that code is really slow. Then I tried Euclid's formula
a = d*(n^2 - m^2)
b = 2*n*m*d
c = d*(n^2 + m^2)
and I have written code where you first find the upper bound for n with
trunc(sqrt(max_value))
//this is in pascal
and then check every combination of 0 < m < n. But I get duplicate results: for example, n = 7, m = 5, d = 1 and n = 6, m = 1, d = 2 both give 24, 70 and 74. So what is a good, fast way to calculate the number of Pythagorean triples? I can't seem to find one. Adding all results to an array and then checking the array for duplicates just takes too much time. If anyone can help me with the code, it can be in Pascal, C or Python; I can understand all of them.
The Wikipedia page on Pythagorean triples gives us a hint:
The triple generated by Euclid's formula is primitive if and only if m and n are coprime and m − n is odd. If both m and n are odd, then a, b, and c will be even, and so the triple will not be primitive; however, dividing a, b, and c by 2 will yield a primitive triple if m and n are coprime
If you restrict m and n to coprime numbers and force m - n to be odd, you will uniquely generate all the primitive Pythagorean triples. From this point on, you should be able to multiply these unique triples by factors of d to uniquely generate all triples.
In your example, allowing n=7 and m=5 was the problem, because their difference was even and the triple they generated was not primitive (you could divide all sides by 2 to get a smaller triple)
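A minimal Python sketch of that approach, keeping the question's convention n > m. The function name and the `max_c` bound are illustrative, and `math.isqrt` assumes Python 3.8 or later:

```python
from math import gcd, isqrt

def pythagorean_triples(max_c):
    """Yield every triple (a, b, c) with a < b and c <= max_c exactly once."""
    for n in range(2, isqrt(max_c) + 1):
        for m in range(1, n):
            # Coprime with odd difference: (n, m) gives a primitive triple.
            if (n - m) % 2 == 1 and gcd(m, n) == 1:
                a, b, c = n * n - m * m, 2 * n * m, n * n + m * m
                if a > b:
                    a, b = b, a
                # Scale the primitive triple by every d that keeps d*c <= max_c.
                for d in range(1, max_c // c + 1):
                    yield d * a, d * b, d * c

triples = list(pythagorean_triples(100))
print(triples.count((24, 70, 74)))  # 1
```

With the parity and coprimality filter in place, (24, 70, 74) is produced only from n = 6, m = 1, d = 2, so no duplicate check on an array is needed.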
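I was curious so I decided to try this. I found that this algorithm was pretty easy to implement in Python and works pretty fast. Note that, as written, it only walks the family of triples whose hypotenuse is one more than the longer leg (3-4-5, 5-12-13, 7-24-25, ...), not every triple: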
import math
def pythagorean_triples(n):
    a, b, c = 1, 3, 0
    while c < n:
        a_ = (a * b) + a
        c = math.sqrt(a_**2 + b**2)
        if c == int(c):
            yield b, a_, int(c)
        a += 1
        b += 2
if __name__ == '__main__':
    import sys
    for pt in pythagorean_triples(int(sys.argv[1])):
        print(pt)
Try it by copying that script into pythagorean_triples.py and running python3 pythagorean_triples.py n, where n is the bound on c at which generation stops (the last triple printed is the first one whose c reaches or exceeds n). A recent Python 2 works as well.

Writing a simple function using while

A Python HOMEWORK Assignment asks me to write a function “that takes as input a positive whole number, and prints out a multiplication table showing all the whole number multiplications up to and including the input number.” (It must also use a while loop.)
# This is an example of the output of the function
print_multiplication_table(3)
>>> 1 * 1 = 1
>>> 1 * 2 = 2
>>> 1 * 3 = 3
>>> 2 * 1 = 2
>>> 2 * 2 = 4
>>> 2 * 3 = 6
>>> 3 * 1 = 3
>>> 3 * 2 = 6
>>> 3 * 3 = 9
I know how to start, but don’t know what to do next. I just need some help with the algorithm. Please DO NOT WRITE THE CORRECT CODE, because I want to learn. Instead tell me the logic and reasoning.
Here is my reasoning:
The function should multiply all real numbers to the given value(n) times 1 less than n or (n-1)
The function should multiply all real numbers to n(including n) times two less than n or (n-2)
The function should multiply all real numbers to n(including n) times three less than n or (n-3) and so on... until we reach n
When the function reaches n, the function should also multiply all real numbers to n(including n) times n
The function should then stop or in the while loop "break"
Then the function has to print the results
So this is what I have so far:
def print_multiplication_table(n): # n for a number
    if n >=0:
        while something:
            # The rest of the code that I need help on
    else:
        return "The input is not a positive whole number. Try another input!"
Edit: Here's what I have after all the wonderful answers from everyone
"""
i * j = answer
i is counting from 1 to n
for each i, j is counting from 1 to n
"""
def print_multiplication_table(n): # n for a number
    if n >=0:
        i = 0
        j = 0
        while i <n:
            i = i + 1
            while j <i:
                j = j + 1
                answer = i * j
                print i, " * ",j,"=",answer
    else:
        return "The input is not a positive whole number.Try another input!"
It's still not completely done!
For example:
print_multiplication_table(2)
# The output
>>>1 * 1 = 1
>>>2 * 2 = 4
And NOT
>>> 1 * 1 = 1
>>> 1 * 2 = 2
>>> 2 * 1 = 2
>>> 2 * 2 = 4
What am I doing wrong?
I'm a little mad about the while loop requirement, because for loops are better suited for this in Python. But learning is learning!
Let's think. Why do a While True? That will never terminate without a break statement, which I think is kind of lame. How about another condition?
What about variables? I think you might need two. One for each number you want to multiply. And make sure you add to them in the while loop.
I'm happy to add to this answer if you need more help.
Your logic is pretty good. But here's a summary of mine:
Stop the loop when the product of the two numbers is n * n.
In the meantime, print each number and their product. If the first number isn't n, increment it. Once that's n, start incrementing the second one. (This could be done with if statements, but nested loops would be better.) If they're both n, the while block will end because the condition will be met.
As per your comment, here's a little piece of hint-y pseudocode:
while something:
    while something else:
        do something fun
        j += 1
    i += 1
Where should the original assignment of i and j go? What is something, something else, and something fun?
This problem is better implemented using nested loops since you have two counters. First figure out the limits (start, end values) for the two counters. Initialize your counters to lower limits at the beginning of the function, and test the upper limits in the while loops.
The first step towards being able to produce a certain output is to recognize the pattern in that output.
1 * 1 = 1
1 * 2 = 2
1 * 3 = 3
2 * 1 = 2
2 * 2 = 4
2 * 3 = 6
3 * 1 = 3
3 * 2 = 6
3 * 3 = 9
The number on the right of = should be trivial to determine, since we can calculate it by multiplying the other two numbers on each row; obtaining those is the core of the assignment. Think of the two operands of * as two counters, let's call them i and j. We can see that i is counting from 1 to 3, but for each i, j is counting from 1 to 3 (resulting in a total of 9 rows; more generally there will be n^2 rows). Therefore, you might try using nested loops, one to loop i (from 1 to n) and another to loop j (from 1 to n) for each i. On each iteration of the nested loop, you can print the string containing i, j and i*j in the desired format.
