Big O notation - having a hard time proving it - Python

I need to prove that t(n) is O(n!), where t(n) = (n!)(n-1).
This is the inequality I'm working with; any suggestions?
(n!)(n-1) <= c(n!)
I'm having a hard time proving this. Would this inequality work instead?
(n!)(n-1) <= c(n * n!)

It isn't O(n!). You have the right inequality that would need to hold if n!(n-1) = O(n!):
n!(n-1) <= cn!
But then dividing both sides by n! gives:
n-1 <= c
There's no constant c that's greater than all positive integers, so you have a contradiction.
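A quick numerical check makes this visible, and also shows that the second inequality does work: t(n)/n! grows without bound, while t(n)/(n * n!) stays at most 1, so t(n) <= c(n * n!) holds with c = 1; that is, t(n) is O(n * n!). A small sketch of mine, not part of the original exchange:

    from math import factorial

    # t(n) = n! * (n - 1); compare it against the candidate bounds n! and n * n!
    for n in range(2, 8):
        t = factorial(n) * (n - 1)
        print(n, t // factorial(n), t / (n * factorial(n)))
        # first ratio is n - 1 (unbounded); second is (n - 1) / n <= 1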

Should I consider whether an input value is even or odd when calculating time complexity?

I have the following function:
def pascal_triangle(i: int, j: int):
    j = min(j, i - j + 1)
    if i == 1 or j == 1:
        return 1
    elif j > i or j == 0 or i < 1 or j < 1:
        return 0
    else:
        return pascal_triangle(i - 1, j - 1) + pascal_triangle(i - 1, j)
The input value for j has the following constraint:
1<=j<=i/2
My computation of the time complexity is as follows. Expanding the recursion n levels gives:
f(i,j) = f(i-1,j-1) + f(i-1,j) = f(i-n,j-n) + ... + f(i-n,j)
So, to find the max n, we have:
i-n>=1
j-n>=1
i-n>=j
and, since we know that:
j>=1
j<=i/2
The max n is i/2-1, so the time complexity is O(2^(i/2-1)), and the space complexity is the maximum depth of recursion (n) times the space needed each time (O(1)), i.e. O(2^(i/2-1)).
I hope my calculation is correct. Now my concern is that if i is odd, i/2 is not an integer, but the exponent must be an integer. Therefore, I want to know: should I write the time complexity like this?
The time complexity and space complexity of the function are both:
O(2^(i/2-1)) if i is even
O(2^(i/2-0.5)) if i is odd
At first glance, the time and space analysis looks (roughly) correct. I haven't made a super close inspection, however, since it doesn't appear to be the focus of the question.
Regarding the time complexity for even / odd inputs, the answer is that the time complexity is O(sqrt(2)^i), regardless of whether i is even or odd.
In the even case, we have O(2^(i / 2 - 1)) ==> O(1/2 * sqrt(2)^i) ==> O(sqrt(2)^i).
In the odd case, we have O(2^(i / 2 - 0.5)) ==> O(sqrt(2) / 2 * sqrt(2)^i) ==> O(sqrt(2)^i).
What you've written is technically correct, but significantly more verbose than necessary. (At the very least, it's poor style, and if this question was on a homework assignment or an exam, I personally think one could justify a penalty on this basis.)
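For an empirical sanity check of the O(sqrt(2)^i) bound, one can count the recursive calls directly. A small sketch (the counting function and test values are mine):

    import math

    def count_calls(i, j):
        # same structure as pascal_triangle, but returns the number of calls made
        j = min(j, i - j + 1)
        if i == 1 or j == 1:
            return 1
        elif j > i or j == 0 or i < 1 or j < 1:
            return 1
        else:
            return 1 + count_calls(i - 1, j - 1) + count_calls(i - 1, j)

    for i in (10, 20, 30):
        print(i, count_calls(i, i // 2), round(math.sqrt(2) ** i))

The call counts should grow by roughly a factor of sqrt(2) per unit increase of i, matching the bound.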

Run-time Complexity for two algorithms (Big O notation calculation)

What is the big-O complexity of these two algorithms?
def foo1(n):
    if n > 1:
        for i in range(int(n)):
            foo1(1)
        foo1(n / 2)

def foo2(lst1, lst2):
    i = 1
    while i < max(len(lst1), len(lst2)):
        j = 1
        while j < min(len(lst1), len(lst2)):
            j *= 2
        i *= 2
I thought that foo1's run-time complexity is O(n), because looking at the for loop I can write:
T(n) = O(n) + O(n/2) <= c*O(n) (c is a constant) for all n.
Is that right?
And I can't work out the run time of foo2; can someone help me with that?
thanks...
The number of operations T(n) is equal to T(n/2) + n. Applying the Master theorem we get T(n) = O(n). In simple terms, there are n + n/2 + n/4 + ... + 1 operations in total, which is less than 2*n, so T(n) is O(n).
The inner loop does not depend on the outer loop, so we can treat them independently. T(n) = O(log(maxlen) * log(minlen)).
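To make the foo2 bound concrete, one can count inner-loop iterations and compare against ceil(log2(maxlen)) * ceil(log2(minlen)). A quick sketch (names and test lengths are mine):

    import math

    def foo2_count(len1, len2):
        # count how many times the inner loop body of foo2 executes
        count = 0
        i = 1
        while i < max(len1, len2):
            j = 1
            while j < min(len1, len2):
                j *= 2
                count += 1
            i *= 2
        return count

    for m, k in [(16, 1024), (1000, 1000), (10, 10 ** 6)]:
        predicted = (math.ceil(math.log2(max(m, k)))
                     * math.ceil(math.log2(min(m, k))))
        print(m, k, foo2_count(m, k), predicted)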

Sum of even Fibonacci numbers up to 4 million

The method I've used to try and solve this works, but I don't think it's very efficient, because as soon as I enter a number that is too large it doesn't work.
def fib_even(n):
    fib_even = []
    a, b = 0, 1
    for i in range(0, n):
        c = a + b
        if c % 2 == 0:
            fib_even.append(c)
        a, b = b, a + b
    return fib_even

def sum_fib_even(n):
    fib_evens = fib_even(n)
    s = 0
    for i in fib_evens:
        s = s + i
    return s

n = 4000000
answer = sum_fib_even(n)
print answer
This for example doesn't work for 4000000 but will work for 400. Is there a more efficient way of doing this?
It is not necessary to compute all the Fibonacci numbers.
Note: in what follows I use the more standard initial values F[0]=0, F[1]=1 for the Fibonacci sequence. Project Euler #2 starts its sequence with F[2]=1, F[3]=2, F[4]=3, .... For this problem the result is the same for either choice.
Summation of all Fibonacci numbers (as a warm-up)
The recursion equation
F[n+1] = F[n] + F[n-1]
can also be read as
F[n-1] = F[n+1] - F[n]
or
F[n] = F[n+2] - F[n+1]
Summing this up for n from 1 to N (remember F[0]=0, F[1]=1) gives on the left the sum of Fibonacci numbers, and on the right a telescoping sum where all of the inner terms cancel
sum(n=1 to N) F[n] = (F[3]-F[2]) + (F[4]-F[3]) + (F[5]-F[4])
+ ... + (F[N+2]-F[N+1])
= F[N+2] - F[2]
So for the sum with the question's N=4,000,000 one would just have to compute
F[4,000,002] - 1
with one of the superfast methods for the computation of single Fibonacci numbers. Either halving-and-squaring, equivalent to exponentiation of the iteration matrix, or the exponential formula based on the golden ratio (computed in the necessary precision).
Since about every 20 Fibonacci numbers you gain 4 additional digits, the final result will consist of about 800000 digits. Better use a data type that can contain all of them.
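The halving-and-squaring approach can be written compactly with the standard fast-doubling identities F[2k] = F[k]*(2*F[k+1] - F[k]) and F[2k+1] = F[k]^2 + F[k+1]^2. A minimal Python sketch (the function name is mine):

    def fib_pair(n):
        # return (F[n], F[n+1]) in O(log n) arithmetic steps (fast doubling)
        if n == 0:
            return (0, 1)
        a, b = fib_pair(n // 2)      # a = F[k], b = F[k+1] with k = n // 2
        c = a * (2 * b - a)          # F[2k]
        d = a * a + b * b            # F[2k+1]
        return (c, d) if n % 2 == 0 else (d, c + d)

    # Sum of the first 4,000,000 Fibonacci numbers via the telescoping identity:
    total = fib_pair(4000002)[0] - 1   # F[N+2] - F[2] with N = 4,000,000

Each of the O(log n) steps multiplies very large integers, so the actual runtime is dominated by big-integer arithmetic, but it should still be dramatically faster than doing 4,000,000 additions of huge numbers.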
Summation of the even Fibonacci numbers
Just inspecting the first 10 or 20 Fibonacci numbers reveals that all even members have an index of 3*k. Check by subtracting two successive recursions to get
F[n+3]=2*F[n+2]-F[n]
so F[n+3] always has the same parity as F[n]. With a little more computation one finds a recursion for members three indices apart:
F[n+3] = 4*F[n] + F[n-3]
Setting
S = sum(k=1 to K) F[3*k]
and summing the recursion over n=3*k gives
F[3*K+3]+S-F[3] = 4*S + (-F[3*K]+S+F[0])
or
4*S = (F[3*K+3]+F[3*K]) - (F[3]+F[0]) = 2*F[3*K+2] - 2*F[2]
So the desired sum has the formula
S = (F[3*K+2]-1)/2
A quick calculation with the golden ratio formula reveals what N should be so that F[N] is just below the boundary, and thus what K = N div 3 should be:
N = Floor( log( sqrt(5)*Max )/log( 0.5*(1+sqrt(5)) ) )
Reduction of the Euler problem to a simple formula
In the original problem, one finds that N=33 and thus the sum is
S = (F[35]-1)/2;
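Putting the pieces together for the original Euler problem, a sketch (fib_pair is from the snippet above; the function name here is mine):

    import math

    def sum_even_fib_up_to(max_value=4000000):
        # S = (F[3K+2] - 1) / 2, where K = N // 3 and N is the largest
        # index with F[N] <= max_value
        phi = 0.5 * (1 + math.sqrt(5))
        N = int(math.log(math.sqrt(5) * max_value) / math.log(phi))
        K = N // 3
        return (fib_pair(3 * K + 2)[0] - 1) // 2

    print(sum_even_fib_up_to())   # 4613732, with N = 33 and K = 11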
Reduction of the problem in the question and consequences
Taking the problem as (mis)stated in the question, N=4,000,000, so K=1,333,333 and the sum is
(F[4,000,001]-1)/2
which still has about 836,000 digits. And yes, big-integer types can handle such numbers; it just takes time to compute with them.
If printed in a format of 60 lines of 80 digits each, this number would fill about 175 sheets of paper, just to give an idea of what the output would look like.
It should not be necessary to store all the intermediate Fibonacci numbers; perhaps the storage is what causes the performance problem.
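And for completeness, the simple fix to the code in the question, which needs neither a stored list nor millions of terms: iterate until the Fibonacci value itself exceeds the bound (a sketch; the function name is mine):

    def sum_even_fib_below(limit=4000000):
        # sum the even Fibonacci numbers not exceeding limit, in O(1) memory
        total, a, b = 0, 1, 2
        while a <= limit:
            if a % 2 == 0:
                total += a
            a, b = b, a + b
        return total

    print(sum_even_fib_below())   # 4613732

This runs essentially instantly, since the Fibonacci values exceed 4,000,000 after about 33 steps.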

Generating random numbers under very specific constraints

I am faced with the following programming problem. I need to generate n (a, b) tuples for which the sum of all a's is a given A and sum of all b's is a given B and for each tuple the ratio of a / b is in the range (c_min, c_max). A / B is within the same range, too. I am also trying to make sure there is no bias in the result other than what is introduced by the constraints and the a / b values are more-or-less uniformly distributed in the given range.
Some clarifications and meta-constraints:
A, B, c_min, and c_max are given.
The ratio A / B is in the (c_min, c_max) range. This has to be so if the problem is to have a solution given the other constraints.
a and b are > 0 and real-valued (not necessarily integers).
I am trying to implement this in Python but ideas in any language (English included) are much appreciated.
We look for tuples a_i and b_i such that
(a_1, ... a_n) and (b_1, ... b_n) have a distribution which is invariant under permutation of indices (what you would call "unbiased")
the ratios a_i / b_i are uniformly distributed on [cmin, cmax]
sum(a_i) = A, sum(b_i) = B
If c_min and c_max are not too ill-conditioned (i.e. they are not very close to one another), and n is not very large, the following works:
Generate a_i "uniformly" such that sum a_i = A:
Draw n samples aa_i (i = 1..n) from some distribution (eg. uniform)
Divide them by their sum and multiply by A: a_i = A * aa_i / sum(aa_i) has desired properties.
Generate b_i such that sum b_i = B by the same method.
If there exists i such that a_i / b_i is not in the interval [cmin, cmax], throw away all the a_i and b_i and try again from the beginning.
It doesn't scale well with n, because the set of a_i and b_i satisfying the constraints gets more and more narrow as n increases (and so you reject more candidates).
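A minimal sketch of this rejection loop (the function name, argument names, and the retry cap are mine):

    import random

    def sample_constrained(n, A, B, c_min, c_max, max_tries=1000000):
        # draw (a_i), (b_i) with sum(a) = A, sum(b) = B, and every
        # a_i / b_i in [c_min, c_max], by rejection
        for _ in range(max_tries):
            aa = [random.random() for _ in range(n)]
            bb = [random.random() for _ in range(n)]
            a = [A * x / sum(aa) for x in aa]
            b = [B * y / sum(bb) for y in bb]
            if all(c_min <= x / y <= c_max for x, y in zip(a, b)):
                return a, b
        raise RuntimeError("constraints too tight for this n; too many rejections")

    a, b = sample_constrained(n=5, A=100.0, B=200.0, c_min=0.25, c_max=0.75)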
To be honest, I don't see any other simple solution. If n gets large and cmin ~ cmax, then you will have to use a sledgehammer (eg. MCMC) to generate samples from your distribution, unless there is some trick we did not see.
If you really want to use MCMC algorithms, note that you can change cmin to cmin * B / A (likewise for cmax) and assume A == B == 1. The problem is then to draw uniformly on the product of two unit n-simplices (u_1...u_n, v_1...v_n) such that
u_i / v_i \in [cmin, cmax].
So you have to use a MCMC algorithm (Metropolis-Hastings seems more suited) on the product of two unit n-simplices with the density
f(u_1, ..., u_n, v_1, ..., v_n) = \prod indicator_{u_i/v_i \in [cmin, cmax]}
which is definitely doable (albeit involved).
Start by generating as many identical tuples, n, as you need:
(A/n, B/n)
Now pick two tuples at random. Make a random change to the a value of one, and a compensating change to the a value of the other, keeping everything within the given constraints. Put the two tuples back.
Now pick another random pair. This time, twiddle the b values.
Lather, rinse repeat.
I think the simplest thing is to:
Use your favorite method to throw n-1 values such that \sum_{i=0}^{n-1} a_i < A, and set a_n to get the right total. There are several SO questions about doing that, though I've never seen an answer I'm really happy with yet. Maybe I'll write a paper or something. (One standard construction is sketched below.)
Get the n-1 b's by throwing the c_i uniformly on the allowed range, and set the final b to get the right total, then check the final c (I think it must be OK, but I haven't proven it yet).
Note that since we have 2 hard constraints we should expect to throw 2n-2 random numbers, and this method does exactly that (on the assumption that you can do step 1 with n-1 throws).
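For step 1, one standard construction (a sketch; not necessarily the method the answer has in mind) is to cut the interval [0, A] at n-1 uniformly random points and take the gaps, which gives a uniform sample from the simplex of positive values with the right sum:

    import random

    def uniform_partition(n, total):
        # split total into n positive parts, uniformly over the simplex:
        # sort n-1 uniform cut points in [0, total] and return the gaps
        cuts = sorted(random.uniform(0, total) for _ in range(n - 1))
        return [hi - lo for lo, hi in zip([0.0] + cuts, cuts + [total])]

    a = uniform_partition(10, 100.0)   # sums to 100.0 (up to float rounding)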
Blocked Gibbs sampling is pretty simple and converges to the right distribution (this is along the lines of what Alexandre is proposing).
For all i, initialize ai = A / n and bi = B / n.
Select i ≠ j uniformly at random. With probability 1/2, update ai and aj with uniform random values satisfying the constraints. The rest of the time, do the same for bi and bj.
Repeat Step 2 as many times as seems to be necessary for your application. I have no idea what the convergence rate is.
Lots of good ideas here. Thanks! Rossum's idea seemed the most straightforward implementation-wise so I went for it. Here is the code for posterity:
import random   # needed for random.sample and random.uniform

c_min = 0.25
c_max = 0.75
a_sum = 100.0
b_sum = 200.0
n = 1000
a = [a_sum / n] * n
b = [b_sum / n] * n

while not good_enough(a, b):   # good_enough is discussed below
    # adjust a random pair of a values, keeping both ratios in [c_min, c_max]
    i, j = random.sample(range(n), 2)
    li, ui = c_min * b[i] - a[i], c_max * b[i] - a[i]
    lj, uj = a[j] - c_min * b[j], a[j] - c_max * b[j]
    llim = max((li, uj))
    ulim = min((ui, lj))
    q = random.uniform(llim, ulim)
    a[i] += q
    a[j] -= q
    # adjust a random pair of b values the same way
    i, j = random.sample(range(n), 2)
    li, ui = a[i] / c_max - b[i], a[i] / c_min - b[i]
    lj, uj = b[j] - a[j] / c_max, b[j] - a[j] / c_min
    llim = max((li, uj))
    ulim = min((ui, lj))
    q = random.uniform(llim, ulim)
    b[i] += q
    b[j] -= q
The good_enough(a, b) function can be a lot of things. I tried:
Standard deviation, which is hit or miss, as you don't know what is a good enough value.
Kurtosis, where a large negative value would be nice. However, it is relatively slow to calculate and is undefined with the seed values of (a_sum / n, b_sum / n) (though that's trivial to fix).
Skewness, where a value close to 0 is desirable. But it has the same drawbacks as kurtosis.
A number of iterations proportional to n. 2n sometimes wasn't enough; n^2 is a bit of overkill and is, well, quadratic.
Ideally, a heuristic using a combination of skewness and kurtosis would be best but I settled for making sure each value has been changed from the initial (again, as rossum suggested in a comment). Though there is no theoretical guarantee that the loop will complete, it seemed to work well enough for me.
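For completeness, one possible good_enough along those lines, checking that every entry has moved away from its seed value (a sketch; the tolerance is mine, and it reads the a_sum, b_sum, and n globals from the code above):

    def good_enough(a, b, tol=1e-12):
        # stop once every a and b entry differs from its initial seed value
        return (all(abs(x - a_sum / n) > tol for x in a)
                and all(abs(y - b_sum / n) > tol for y in b))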
So here's what I think from a mathematical point of view. We have sequences a_i and b_i such that the sum of the a_i is A and the sum of the b_i is B. Furthermore, A/B is in (x,y) and so is a_i/b_i for each i. Furthermore, you want the a_i/b_i to be uniformly distributed in (x,y).
So do it starting from the end. Choose c_i from (x,y) such that they are uniformly distributed. Then we want to have the following equality a_i/b_i = c_i, so a_i = b_i*c_i.
Therefore we only need to find b_i. But we have the following system of linear equations:
A = (sum)b_i*c_i
B = (sum)b_i
where b_i are variables. Solve it (some fancy linear algebra tricks) and you're done!
Note that for large enough n this system will have lots of solutions. They will be dependent on some parameters which you can choose randomly.
Enough of the theoretical approach, let's see some practical solution.
// EDIT 1: Here's some hard core Python code :D
import random

c_min = 0.0   # renamed from min/max so as not to shadow the built-ins
c_max = 10.0
A = 500.0
B = 100.0

def generate(n):
    C = [c_min + i * (c_max - c_min) / (n + 1) for i in range(1, n + 1)]
    Y = [0]
    for i in range(1, n - 1):
        # This line should be changed in order to always get positive numbers
        # It should be relatively easy to figure out some good random generator
        Y.append(random.random())
    val = A - C[0] * B
    for i in range(1, n - 1):
        val -= Y[i] * (C[i] - C[0])
    val /= (C[n - 1] - C[0])
    Y.append(val)
    val = B
    for i in range(1, n):
        val -= Y[i]
    Y[0] = val
    result = []
    for i in range(0, n):
        result.append([Y[i] * C[i], Y[i]])
    return result
The result is a list of pairs (X,Y) satisfying your conditions, with the exception that they may be negative (see the random generator line in the code), i.e. the first and the last pair may contain negative numbers.
// EDIT 2:
To ensure that they are positive you may try something like
Y.append(random.random() * B / n)
instead of
Y.append(random.random())
I'm not sure though.
// EDIT 3:
In order to have better results try something like this:
avrg = B / n
ran = avrg / 20
for i in range(1, n - 1):
    Y.append(random.gauss(avrg, ran))

instead of

for i in range(1, n - 1):
    Y.append(random.random())
This will make all the b_i be near B / n. Unfortunately the last term will still sometimes jump high. I'm sorry, but there is no way to avoid this (mathematics), since the last and the first terms depend on the others. For small n (~100) it looks good, though. Unfortunately some negative values may appear.
The choice of a correct generator is not so simple if you additionally want b_i to be uniformly distributed.

Efficient python code for printing the product of divisors of a number

I am trying to solve a problem involving printing the product of all divisors of a given number. The number of test cases t satisfies 1 <= t <= 300000, and each number n satisfies 1 <= n <= 500000.
I wrote the following code, but it always exceeds the time limit of 2 seconds. Are there any ways to speed up the code ?
from math import sqrt

def divisorsProduct(n):
    ProductOfDivisors = 1
    for i in range(2, int(round(sqrt(n))) + 1):
        if n % i == 0:
            ProductOfDivisors *= i
            if n / i != i:
                ProductOfDivisors *= (n / i)
    if ProductOfDivisors <= 9999:
        print ProductOfDivisors
    else:
        result = str(ProductOfDivisors)
        print result[len(result) - 4:]

T = int(raw_input())
for i in range(1, T + 1):
    num = int(raw_input())
    divisorsProduct(num)
Thank You.
You need to clarify by what you mean by "product of divisors." The code posted in the question doesn't work for any definition yet. This sounds like a homework question. If it is, then perhaps your instructor was expecting you to think outside the code to meet the time goals.
If you mean the product of unique prime divisors, e.g., 72 gives 2*3 = 6, then having a list of primes is the way to go. Just run through the list up to the square root of the number, multiplying present primes into the result. There are not that many, so you could even hard code them into your program.
If you mean the product of all the divisors, prime or not, then it is helpful to think of what the divisors are. You can make serious speed gains over the brute force method suggested in the other answers and yours. I suspect this is what your instructor intended.
If the divisors are ordered in a list, then they occur in pairs that multiply to n -- 1 and n, 2 and n/2, etc. -- except for the case where n is a perfect square, where the square root is a divisor that is not paired with any other.
So the result will be n to the power of half the number of divisors (regardless of whether or not n is a square).
To compute this, find the prime factorization using your list of primes. That is, find the power of 2 that divides n, then the power of 3, etc. To do this, take out all the 2s, then the 3s, etc.
The number you are taking the factors out of will be getting smaller, so you can do the square root test on the smaller intermediate numbers to see if you need to continue up the list of primes. To gain some speed, test p*p <= m rather than p <= sqrt(m).
Once you have the prime factorization, it is easy to find the number of divisors. For example, suppose the factorization is 2^i * 3^j * 7^k. Then, since each divisor uses the same prime factors, with exponents less than or equal to those in n including the possibility of 0, the number of divisors is (i+1)(j+1)(k+1).
E.g., 72 = 2^3 * 3^2, so the number of divisors is 4*3 = 12, and their product is 72^6 = 139,314,069,504.
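A sketch of that method in Python (my own illustration; it uses plain trial division for the factorization, and math.isqrt needs Python 3.8+):

    from math import isqrt

    def divisor_product(n):
        # count the divisors d(n) from the prime factorization of n
        d, m, p = 1, n, 2
        while p * p <= m:
            if m % p == 0:
                e = 0
                while m % p == 0:
                    m //= p
                    e += 1
                d *= e + 1          # a prime with exponent e contributes e + 1
            p += 1
        if m > 1:                   # one prime factor above sqrt(n) is left
            d *= 2
        # product of all divisors is n ** (d / 2); d is odd exactly when
        # n is a perfect square, whose square root is the unpaired divisor
        if d % 2 == 0:
            return n ** (d // 2)
        return n ** (d // 2) * isqrt(n)

    print(divisor_product(72))      # 139314069504 == 72 ** 6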
By using math, the algorithm can become much better than O(n). But it is hard to estimate your speed gains ahead of time because of the relatively small size of the n in the input.
You could eliminate the if statement in the loop by only looping to less than the square root, and check for square root integer-ness outside the loop.
It is a rather strange question you pose. I have a hard time imagining a use for it, other than it possibly being an assignment in a course. My first thought was to pre-compute a list of primes and only test against those, but I assume you are quite deliberately counting non-prime factors? I.e., if the number has factors 2 and 3, you are also counting 6.
If you do use a table of pre-computed primes, you would then have to also subsequently include all possible combinations of primes in your result, which gets more complex.
C is really a great language for that sort of thing, because even suboptimal algorithms run really fast.
Okay, I think this is close to the optimal algorithm. It produces the product_of_divisors for each number in range(500000).
import math

def number_of_divisors(maxval=500001):
    """ Example: the number of divisors of 12 is 6: 1, 2, 3, 4, 6, 12.
    Given a prime factoring of n, the number of divisors of n is the
    product of each factor's multiplicity plus one (mpo in my variables).
    This function works like the Sieve of Eratosthenes, but marks each
    composite n with the multiplicity (plus one) of each prime factor. """
    numdivs = [1] * maxval  # multiplicative identity
    currmpo = [0] * maxval
    # standard logic for 2 < p < sqrt(maxval)
    for p in range(2, int(math.sqrt(maxval))):
        if numdivs[p] == 1:  # if p is prime
            for exp in range(2, 50):  # assume maxval < 2^50
                pexp = p ** exp
                if pexp > maxval:
                    break
                exppo = exp + 1
                for comp in range(pexp, maxval, pexp):
                    currmpo[comp] = exppo
            for comp in range(p, maxval, p):
                thismpo = currmpo[comp] or 2
                numdivs[comp] *= thismpo
                currmpo[comp] = 0  # reset currmpo array in place
    # abbreviated logic for p > sqrt(maxval)
    for p in range(int(math.sqrt(maxval)), maxval):
        if numdivs[p] == 1:  # if p is prime
            for comp in range(p, maxval, p):
                numdivs[comp] *= 2
    return numdivs

# this initialization times at 7s on my machine
NUMDIV = number_of_divisors()

def product_of_divisors(n):
    if NUMDIV[n] % 2 == 0:
        # each pair of divisors has product equal to n, for example
        # 1*12 * 2*6 * 3*4 = 12**3
        return n ** (NUMDIV[n] / 2)
    else:
        # perfect squares have their square root as an unmatched divisor
        return n ** (NUMDIV[n] / 2) * int(math.sqrt(n))

# this loop times at 13s on my machine
for n in range(500000):
    a = product_of_divisors(n)
On my very slow machine, it takes 7s to compute number_of_divisors for each number, then 13s to compute product_of_divisors for each. Of course it can be sped up by translating it into C. (#someone with a fast machine: how long does it take on your machine?)
