I need to create a function that takes one int argument and returns an int representing the number of partitions of the input integer into distinct parts (excluding the trivial single-part partition). Namely,
input:3 -> output: 1 -> {1, 2}
input:6 -> output: 3 -> {1, 2, 3}, {2, 4}, {1, 5}
...
Since I am looking only for distinct parts, something like this is not allowed:
4 -> {1, 1, 1, 1} or {1, 1, 2}
So far I have managed to come up with some algorithms that find every possible combination, but they are pretty slow and only practical up to about n = 100.
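For illustration, a minimal sketch of that kind of brute-force enumeration (just the idea, not my exact code): try every subset of {1, ..., n-1} and keep the ones that sum to n, which is exponential in n.

from itertools import combinations

def count_distinct_partitions_bruteforce(n):
    # Try every subset of {1, ..., n-1} with at least two elements
    # and count the ones whose elements sum to n.  Exponential in n.
    count = 0
    for size in range(2, n):
        for combo in combinations(range(1, n), size):
            if sum(combo) == n:
                count += 1
    return count

print(count_distinct_partitions_bruteforce(6))  # 3 -> {1, 2, 3}, {2, 4}, {1, 5}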
And since I only need the number of combinations, not the combinations themselves, the partition function Q should solve the problem.
Does anybody know how to implement this efficiently?
More information about the problem: OEIS, Partition Function Q
EDIT:
To avoid any confusion: the DarrylG answer also counts the trivial (single-part) partition, but this does not affect its quality in any way.
EDIT 2:
The jodag answer (accepted) does not count the trivial partition.
Tested two algorithms
Simple recurrence relation
Wolfram MathWorld algorithm (based upon Georgiadis, Kediaya, Sloane)
Both implemented with memoization using lru_cache.
Results: the Wolfram MathWorld approach is orders of magnitude faster.
1. Simple recurrence relation (with Memoization)
Reference
Code
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n, d=0):
    if n:
        return sum(p(n-k, n-2*k+1) for k in range(1, n-d+1))
    else:
        return 1
Performance
n     Time (sec)
10    0.0020
50    0.5530
100   8.7430
200   168.5830
2. Wolfram MathWorld algorithm
(based upon Georgiadis, Kediaya, Sloane)
Reference
Code
# Implementation of the Q recurrence
# https://mathworld.wolfram.com/PartitionFunctionQ.html
from functools import lru_cache
from math import sqrt

class PartitionQ():
    def __init__(self, MAXN):
        self.MAXN = MAXN
        self.j_seq = self.calc_j_seq(MAXN)

    @lru_cache(maxsize=None)
    def q(self, n):
        " Q strict partition function "
        assert n < self.MAXN
        if n == 0:
            return 1
        sqrt_n = int(sqrt(n)) + 1
        temp = sum(((-1)**(k+1))*self.q(n-k*k) for k in range(1, sqrt_n))
        return 2*temp + self.s(n)

    def s(self, n):
        if n in self.j_seq:
            return (-1)**self.j_seq[n]
        else:
            return 0

    def calc_j_seq(self, MAX_N):
        """ Used to determine if n is of the form j*(3*j (+/-) 1) / 2
            by creating a dictionary of n, j value pairs """
        result = {}
        j = 0
        valn = -1
        while valn <= MAX_N:
            jj = 3*j*j
            valp, valn = (jj - j)//2, (jj + j)//2
            result[valp] = j
            result[valn] = j
            j += 1
        return result
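A minimal usage sketch (assuming the imports added above); the value for n = 10 matches OEIS A000009, since this count includes the trivial single-part partition:

pq = PartitionQ(1000)
print(pq.q(10))   # 10 strict partitions of 10, including the trivial {10}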
Performance
n     Time (sec)
10    0.00087
50    0.00059
100   0.00125
200   0.10933
Conclusion: this algorithm is orders of magnitude faster than the simple recurrence relation.
I think a straightforward and efficient way to solve this is to explicitly compute the coefficient of the generating function from the Wolfram PartitionsQ link in the original post.
This is a pretty illustrative example of how to construct generating functions and how they can be used to count solutions. To start, we recognize that the problem may be posed as follows:
Let m_1 + m_2 + ... + m_{n-1} = n where m_j = 0 or m_j = j for all j.
Q(n) is the number of solutions of the equation.
We can find Q(n) by constructing the following polynomial (i.e. the generating function)
(1 + x)(1 + x^2)(1 + x^3)...(1 + x^(n-1))
The number of solutions is the number of ways the terms combine to make x^n, i.e. the coefficient of x^n after expanding the polynomial. Therefore, we can solve the problem by simply performing the polynomial multiplication.
def Q(n):
    # Represent the polynomial as a list of coefficients from x^0 to x^n.
    # G_0 = 1
    G = [int(g_pow == 0) for g_pow in range(n + 1)]
    for k in range(1, n):
        # G_k = G_{k-1} * (1 + x^k)
        # This is equivalent to adding G shifted to the right by k to G.
        # Ignore powers greater than n since we don't need them.
        G = [G[g_pow] if g_pow - k < 0 else G[g_pow] + G[g_pow - k] for g_pow in range(n + 1)]
    return G[n]
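As a quick sanity check (not part of the original answer), the first few values line up with the hand counts from the question; the trivial single-part partition is excluded:

print([Q(n) for n in range(2, 8)])   # [0, 1, 1, 2, 3, 4]
# e.g. Q(6) == 3 -> {1, 2, 3}, {2, 4}, {1, 5}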
Timing (average of 1000 iterations)
import time

print("n     Time (sec)")
for n in [10, 50, 100, 200, 300, 500, 1000]:
    t0 = time.time()
    for i in range(1000):
        Q(n)
    elapsed = time.time() - t0
    print('%-5d%.08f' % (n, elapsed / 1000))
n Time (sec)
10 0.00001000
50 0.00017500
100 0.00062900
200 0.00231200
300 0.00561900
500 0.01681900
1000 0.06701700
You can memoize the recurrences in equations 8, 9, and 10 of the MathWorld article you linked for a runtime quadratic in N.
def partQ(n):
    result = []
    def rec(part, tgt, allowed):
        if tgt == 0:
            result.append(sorted(part))
        elif tgt > 0:
            for i in allowed:
                rec(part + [i], tgt - i, allowed - set(range(1, i + 1)))
    rec([], n, set(range(1, n)))
    return result
The work is done by the rec internal function, which takes:
part - a list of parts whose sum is always equal to or less than the target n
tgt - the remaining partial sum that needs to be added to the sum of part to get to n
allowed - the set of numbers still allowed to be used in the full partitioning
When tgt = 0 is passed, that means the sum of part is n, and part is added to the result list. If tgt is still positive, each of the allowed numbers is attempted as an extension of part, in a recursive call.
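For example (an illustration of the function above, not code from the answer), for n = 6 it enumerates exactly the three partitions from the question, so the count is simply the length of the result:

parts = partQ(6)
print(parts)        # [[1, 2, 3], [1, 5], [2, 4]]  (order may differ)
print(len(parts))   # 3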
Does the following algorithm have a complexity of O(n log n)?
The thing that confuses me is that this algorithm splits the input in two twice, not once as a regular O(n log n) algorithm does, and each time it does O(n) work.
def equivalent(a, b):
    if isEqual(a, b):
        return True
    half = int(len(a) / 2)
    if 2*half != len(a):
        return False
    if (equivalent(a[:half], b[:half]) and equivalent(a[half:], b[half:])):
        return True
    if (equivalent(a[:half], b[half:]) and equivalent(a[half:], b[:half])):
        return True
    return False
Each of the 4 recursive calls to equivalent reduces the amount of input data by a factor of 2. Thus, assuming that a and b have the same length N and that isEqual has linear time complexity, we can construct the recurrence relation for the overall complexity:

T(N) = 4*T(N/2) + C*N

where C is some constant. We can solve this relation by repeatedly substituting and spotting a pattern:

T(N) = 4*T(N/2) + C*N
     = 4^2*T(N/4) + 2*C*N + C*N
     = 4^3*T(N/8) + 4*C*N + 2*C*N + C*N
     ...
     = 4^m*T(N/2^m) + C*N*(2^m - 1)

What is the upper limit of the summation, m? The stopping condition occurs when len(a) is odd. That may happen anywhere between N and 1, depending on the prime decomposition of N. In the worst-case scenario, N is a power of 2, so the function recurses until len(a) = 1, i.e. m = log2(N) and

T(N) = N^2*T(1) + C*N*(N - 1) = O(N^2)
To complement the above answer, there is a direct way to get this result with the Master Method. The master method works only for recurrences of the following type:
T(n) = aT(n/b) + f(n) where a >= 1 and b > 1
There are three cases, depending on f(n):
If f(n) = Θ(n^c) where c < log_b(a), then T(n) = Θ(n^(log_b a))
If f(n) = Θ(n^c) where c = log_b(a), then T(n) = Θ(n^c * log n)
If f(n) = Θ(n^c) where c > log_b(a), then T(n) = Θ(f(n)) = Θ(n^c)
In your case,
we have a = 4, b = 2, c = 1 and c < log_b(a),
i.e. 1 < log_2(4) = 2,
hence case 1 applies.
Therefore:
T(n) = Θ(n^(log_b a))
T(n) = Θ(n^(log_2 4))
T(n) = Θ(n^2)
More details and examples can be found on Wikipedia.
Hope it helps!
I saw a comment on Google+ a few weeks ago in which someone demonstrated a straightforward computation of Fibonacci numbers which was not based on recursion and didn't use memoization. He effectively just remembered the last 2 numbers and kept adding them. This is an O(n) algorithm, but he implemented it very cleanly. So I quickly pointed out that a quicker way is to take advantage of the fact that they can be computed as powers of the [[0,1],[1,1]] matrix, which requires only O(log(N)) matrix multiplications.
The problem, of course, is that this is far from optimal past a certain point. It is efficient as long as the numbers are not too large, but the Nth Fibonacci number has about N*log(phi)/log(10) digits, where phi is the golden ratio ((1+sqrt(5))/2 ~ 1.6). As it turns out, log(phi)/log(10) is very close to 1/5, so the Nth Fibonacci number can be expected to have roughly N/5 digits.
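A quick numeric check of that digit estimate (my own sketch, not from the original post):

from math import log10, sqrt

phi = (1 + sqrt(5)) / 2
print(log10(phi))                                    # ~0.20898, i.e. roughly one digit per five indices
print(int(1000 * log10(phi) - log10(sqrt(5))) + 1)   # 209, the number of digits of F(1000)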
Matrix multiplication, heck even number multiplication, gets very slow when the numbers start to have millions or billions of digits. So F(100,000) took about .03 seconds to compute (in Python), while F(1,000,000) took roughly 5 seconds. This is hardly O(log(N)) growth. My estimate was that this method, without improvements, only optimizes the computation to be O((log(N))^2.5) or so.
Computing a billionth Fibonacci number, at this rate, would be prohibitively slow (even though it would only have ~ 1,000,000,000 / 5 digits so it easily fits within 32-bit memory).
Does anyone know of an implementation or algorithm which would allow a faster computation? Perhaps something which would allow calculation of a trillionth Fibonacci number.
And just to be clear, I am not looking for an approximation. I am looking for the exact computation (to the last digit).
Edit 1: I am adding the Python code to show what I believe is an O((log N)^2.5) algorithm.
from operator import mul as mul
from time import clock

class TwoByTwoMatrix:
    __slots__ = "rows"

    def __init__(self, m):
        self.rows = m

    def __imul__(self, other):
        self.rows = [[sum(map(mul, my_row, oth_col)) for oth_col in zip(*other.rows)] for my_row in self.rows]
        return self

    def intpow(self, i):
        i = int(i)
        result = TwoByTwoMatrix([[long(1), long(0)], [long(0), long(1)]])
        if i <= 0:
            return result
        k = 0
        while i % 2 == 0:
            k += 1
            i >>= 1
        multiplier = TwoByTwoMatrix(self.rows)
        while i > 0:
            if i & 1:
                result *= multiplier
            multiplier *= multiplier  # square it
            i >>= 1
        for j in xrange(k):
            result *= result
        return result

m = TwoByTwoMatrix([[0, 1], [1, 1]])

t1 = clock()
print len(str(m.intpow(100000).rows[1][1]))
t2 = clock()
print t2 - t1

t1 = clock()
print len(str(m.intpow(1000000).rows[1][1]))
t2 = clock()
print t2 - t1
Edit 2:
It looks like I didn't account for the fact that len(str(...)) would make a significant contribution to the overall runtime of the test. Changing tests to
from math import log as log
t1 = clock()
print log(m.intpow(100000).rows[1][1])/log(10)
t2 = clock()
print t2 - t1
t1 = clock()
print log(m.intpow(1000000).rows[1][1])/log(10)
t2 = clock()
print t2 - t1
shortened the runtimes to .008 seconds and .31 seconds (from .03 seconds and 5 seconds when len(str(...)) were used).
Because M=[[0,1],[1,1]] raised to power N is [[F(N-2), F(N-1)], [F(N-1), F(N)]],
the other obvious source of inefficiency was calculating the (0,1) and (1,0) elements of the matrix as if they were distinct. This is the result (I also switched to Python 3, but the Python 2.7 times are similar):
class SymTwoByTwoMatrix():
    # elements (0,0), (0,1), (1,1) of a symmetric 2x2 matrix are a, b, c.
    # b is also the (1,0) element because the matrix is symmetric

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __imul__(self, other):
        # this multiplication does work correctly because we
        # are multiplying powers of the same symmetric matrix
        self.a, self.b, self.c = \
            self.a * other.a + self.b * other.b, \
            self.a * other.b + self.b * other.c, \
            self.b * other.b + self.c * other.c
        return self

    def intpow(self, i):
        i = int(i)
        result = SymTwoByTwoMatrix(1, 0, 1)
        if i <= 0:
            return result
        k = 0
        while i % 2 == 0:
            k += 1
            i >>= 1
        multiplier = SymTwoByTwoMatrix(self.a, self.b, self.c)
        while i > 0:
            if i & 1:
                result *= multiplier
            multiplier *= multiplier  # square it
            i >>= 1
        for j in range(k):
            result *= result
        return result
calculated F(100,000) in .006, F(1,000,000) in .235 and F(10,000,000) in 9.51 seconds.
This is to be expected: it produces results 45% faster for the fastest test, and the gain should asymptotically approach phi/(1+2*phi+phi*phi) ~ 23.6%.
The (0,0) element of M^N is actually the (N-2)nd Fibonacci number:

m = SymTwoByTwoMatrix(0, 1, 1)   # the same [[0,1],[1,1]] matrix, in symmetric form
for i in range(15):
    x = m.intpow(i)
    print([x.a, x.b, x.c])
gives
[1, 0, 1]
[0, 1, 1]
[1, 1, 2]
[1, 2, 3]
[2, 3, 5]
[3, 5, 8]
[5, 8, 13]
[8, 13, 21]
[13, 21, 34]
[21, 34, 55]
[34, 55, 89]
[55, 89, 144]
[89, 144, 233]
[144, 233, 377]
[233, 377, 610]
I would expect that not having to calculate element (0,0) would produce a speed up of an additional 1/(1+phi+phi*phi) ~ 19%. But the lru_cache of F(2N) and F(2N-1) solution given by Eli Korvigo below actually gives a speed up of 4 times (i.e., 75%). So, while I have not worked out a formal explanation, I am tempted to think that it caches the spans of 1's within the binary expansion of N and does the minimum number of multiplications necessary. Which obviates the need to find those ranges, precompute them and then multiply them at the right point in the expansion of N. lru_cache allows for a top-to-bottom computation of what would have been a more complicated bottom-to-top computation.
Both SymTwoByTwoMatrix and the lru_cache-of-F(2N)-and-F(2N-1) are taking roughly 40 times longer to compute every time N grows 10 times. I think that's possibly due to Python's implementation of multiplication of long ints. I think the multiplication of large numbers and their addition should be parallelizable. So a multi-threaded sub-O(N) solution should be possible even though (as Daniel Fisher states in comments) the F(N) solution is Theta(n).
Since the Fibonacci sequence is a linear recurrence, its members can be evaluated in closed form. This involves computing a power, which can be done in O(log n) similarly to the matrix-multiplication solution, but the constant overhead should be lower. That's the fastest algorithm I know of.
EDIT
Sorry, I missed the "exact" part. Another exact O(log(n)) alternative to the matrix multiplication can be calculated as follows:
from functools import lru_cache

@lru_cache(None)
def fib(n):
    if n in (0, 1):
        return 1
    if n & 1:  # if n is odd, it's faster than checking with modulo
        return fib((n+1)//2 - 1) * (2*fib((n+1)//2) - fib((n+1)//2 - 1))
    a, b = fib(n//2 - 1), fib(n//2)
    return a**2 + b**2
This is based on the derivation from a note by Prof. Edsger Dijkstra. The solution exploits the fact that to calculate both F(2N) and F(2N-1) you only need to know F(N) and F(N-1). Nevertheless, you are still dealing with long-number arithmetics, though the overhead should be smaller than that of the matrix-based solution. In Python you'd better rewrite this in imperative style due to the slow memoization and recursion, though I wrote it this way for the clarity of the functional formulation.
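For reference, one possible iterative fast-doubling sketch in that imperative style (my addition; note it uses the standard indexing F(0) = 0, F(1) = 1, whereas the recursive fib above is shifted by one):

def fib_pair(n):
    # Return (F(n), F(n+1)) with F(0) = 0, F(1) = 1, walking the bits of n
    # from the most significant one down and applying the doubling identities.
    a, b = 0, 1                      # (F(0), F(1))
    for bit in bin(n)[2:]:
        c = a * (2*b - a)            # F(2k)   = F(k) * (2*F(k+1) - F(k))
        d = a*a + b*b                # F(2k+1) = F(k)^2 + F(k+1)^2
        if bit == '1':
            a, b = d, c + d          # advance to (F(2k+1), F(2k+2))
        else:
            a, b = c, d              # advance to (F(2k), F(2k+1))
    return a, b

print(fib_pair(10)[0])   # 55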
This is too long for a comment, so I'll leave an answer.
The answer by Aaron is correct, and I've upvoted it, as should you. I will provide the same answer, and explain why it is not only correct, but the best answer posted so far. The formula we're discussing is F(n) = round(phi^n / sqrt(5)), where phi = (1 + sqrt(5))/2 is the golden ratio.
Computing Φ is O(M(n)), where M(n) is the complexity of multiplication (currently a little over linearithmic) and n is the number of bits.
Then there's a power function, which can be expressed as a log (O(M(n)•log(n))), a multiply (O(M(n))), and an exp (O(M(n)•log(n))).
Then there's a square root (O(M(n))), a division (O(M(n))), and a final round (O(n)).
This makes this answer something like O(n•log^2(n)•log(log(n))) for n bits.
I haven't thoroughly analyzed the division algorithm, but if I'm reading this right, each bit might need a recursion (you need to divide the number log(2^n)=n times) and each recursion needs a multiply. Therefore it can't be better than O(M(n)•n), and that's exponentially worse.
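For concreteness, here is a sketch of that rounding approach using Python's decimal module, with the working precision scaled to n (an illustration of the idea, not code from the answer being discussed):

from decimal import Decimal, getcontext

def fib_round(n):
    # Use roughly n*log10(phi) significant digits plus a safety margin,
    # so that rounding phi**n / sqrt(5) to the nearest integer is exact.
    getcontext().prec = int(n * 0.209) + 10
    sqrt5 = Decimal(5).sqrt()
    phi = (1 + sqrt5) / 2
    return int((phi**n / sqrt5).to_integral_value())

print(fib_round(100))   # 354224848179261915075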
Using the "weird square-rooty equation" from the other answer (the closed-form Fibonacci formula), you CAN compute the kth Fibonacci number exactly. This is because the sqrt(5) falls out in the end. You just have to arrange your multiplication to keep track of it in the meantime.
def rootiply(a1, b1, a2, b2, c):
    ''' multiply a1 + b1*sqrt(c) by a2 + b2*sqrt(c)... return the new a, b '''
    return a1*a2 + b1*b2*c, a1*b2 + a2*b1

def rootipower(a, b, c, n):
    ''' raise a + b*sqrt(c) to the nth power... returns the a, b of the result in the same format '''
    ar, br = 1, 0
    while n != 0:
        if n % 2:
            ar, br = rootiply(ar, br, a, b, c)
        a, b = rootiply(a, b, a, b, c)
        n //= 2  # integer division, so this also works on Python 3
    return ar, br

def fib(k):
    ''' the kth fibonacci number '''
    a1, b1 = rootipower(1, 1, 5, k)
    a2, b2 = rootipower(1, -1, 5, k)
    a = a1 - a2
    b = b1 - b2
    a, b = rootiply(0, 1, a, b, 5)
    # b should be 0!
    assert b == 0
    return a // 2**k // 5  # integer division keeps the result exact

if __name__ == "__main__":
    assert rootipower(1, 2, 3, 3) == (37, 30)  # (1+2*sqrt(3))**2 = 13 + 4*sqrt(3); (1+2*sqrt(3))**3 = 37 + 30*sqrt(3)
    assert fib(10) == 55
From Wikipedia,
For all n ≥ 0, the number F(n) is the closest integer to phi^n / sqrt(5), where phi is the golden ratio. Therefore, it can be found by rounding, that is, by use of the nearest integer function.
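A short illustration of that formula, and of why naive floating point is not enough for the exact computation the question asks for (my own sketch):

from math import sqrt

phi = (1 + sqrt(5)) / 2

def fib_float(n):
    # Nearest-integer form of Binet's formula with 64-bit floats.
    return round(phi**n / sqrt(5))

print([fib_float(n) for n in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
# With double precision this starts drifting around n ~ 70 and overflows
# near n ~ 1475, so exact results need big-integer or arbitrary-precision
# arithmetic, as in the other answers.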
What is the time complexity of generating all permutations of a list in Python? The following code is what needs to have its time complexity computed. How do I do this?
def permute(list):
    tot_list = []
    if len(list) == 1:
        return [list]
    for permutation in permute(list[1:]):
        for i in range(len(list)):
            tot_list.append(permutation[:i] + list[0:1] + permutation[i:])
    return tot_list
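(Not part of the original question, but a quick check that the function indeed returns n! permutations, which is the fact the analysis below builds on:)

from math import factorial

for n in range(1, 7):
    assert len(permute(list(range(n)))) == factorial(n)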
The main challenge with analyzing this function is that there aren't that many recursive calls, but each call returns a progressively larger and larger list of elements. In particular, note that there are n! permutations of a list of n elements. Therefore, we know that if you make a recursive call on a list of size n, there will be n! elements returned.
So let's see how this is going to influence the runtime. If you have a list of just one element, the time complexity is O(1). Otherwise, let's suppose that you have n + 1 elements in the list. The algorithm then
Makes a recursive call on a list of size n.
For each element returned, splices the first element of the list at all possible positions.
Returns the result.
We can analyze the total runtime for the recursion by just looking at the work done at each level of the recursion, meaning that we'll focus on steps (2) and (3) right now.
Notice that in step 2, if there are n + 1 elements in the list, we will have to look at n! permutations generated by the recursive call. Each of those permutations has n elements in it. For each permutation, we make n + 1 new lists, each of which has n + 1 elements in it. Therefore, we have n! loop iterations, each of which does (n + 1)^2 work. Consequently, the total work done at this level is (roughly) (n + 1)^2 · n!. We can note that (n + 1) · n! = (n + 1)!, so the work done could be written as (n + 1)(n + 1)!. This is strictly less than (n + 2)!, so we could say that the work done when there are n + 1 total elements is O((n + 2)!). (Note that we can't say that this is O(n!), because n! = o((n + 2)!).)
So now we can say that the total work done is (roughly speaking) given by
1! + 2! + 3! + ... + (n + 1)!
To the best of my knowledge, this doesn't have a nice, closed-form formula. However, we can note that
1! + 2! + 3! + ... + (n + 1)!
< (n + 1)! + (n + 1)! + ... + (n + 1)!
= (n + 2)(n + 1)!
= (n + 2)!
So the overall expression is O((n + 2)!). Similarly, we have that
1! + 2! + 3! + ... + (n + 1)!
> (n + 1)!
So the overall expression is Ω((n + 1)!). In other words, the true answer is sandwiched (asymptotically) somewhere between (n + 1)! and (n + 2)!. Therefore, the runtime grows factorially fast.
Hope this helps!