Suppose we have the function below:
def func(x, value_):
    assert 0 < x < value_
    while x < value_:
        x *= 2
Although value_ can be arbitrarily large, the while loop is not infinite and the number of comparisons is bounded above by value_. Consequently, is it correct that this function has computational complexity of O(N)?
It's O(log n): x doubles toward value_ on every iteration rather than increasing by a fixed amount. Try drawing a graph of the two quantities and you will see it.
The time complexity will be O(log2(m/n)), where m = value_ and n = x, and log2(i) denotes the logarithm of i in base 2.
Consider that if x is doubled, for a fixed value_, one fewer iteration is computed.
Conversely, if value_ is doubled, for a fixed x, one extra iteration is computed.
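To make that count concrete, here is a small sketch of my own (not from the original question) that counts the loop's iterations and compares them with ceil(log2(value_/x)):

import math

def iterations(x, value_):
    # Count how many times the doubling loop in func runs.
    count = 0
    while x < value_:
        x *= 2
        count += 1
    return count

for m in (10**3, 10**6, 10**9):
    # The count tracks ceil(log2(m / x)), not m itself.
    print(m, iterations(1, m), math.ceil(math.log2(m)))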
I have the two following algorithms. My analysis says that both of them are O(m^2 + 4^n), i.e. they are equivalent for big numbers. Is this right? Note that m and n are the numbers of bits in x and y respectively.
def pow1(x, y):
    if y == 0:
        return 1
    temp = x
    while y > 1:
        y -= 1
        temp *= x
    return temp
def pow2(x, y):
    if y == 0:
        return 1
    temp = pow2(x, y//2)
    if y & 1:
        return temp * temp * x
    return temp * temp
Whether the divide-and-conquer algorithm is more efficient depends on a ton of factors. In Python it is more efficient.
Your analysis is right: assuming standard grade-school multiplication, divide-and-conquer does fewer, more expensive multiplications, and asymptotically that makes the total runtime a wash. Constant factors probably matter -- I'd still guess divide-and-conquer would be faster, because most of the work happens in optimized C rather than in Python loop overhead, but that's just a hunch, and it'd be hard to test given that Python doesn't use an elementary multiplication algorithm.
Before going further, note that big-integer multiplication in Python is little-o of m^2. In particular, it uses Karatsuba, which costs around O(m^0.58 n) for an m-bit integer times an n-bit integer with m <= n.
The small terms using ordinary multiplication won't matter asymptotically, so focusing on the large ones we can replace the multiplication cost and find that your iterative algorithm is around O(4^n m^1.58) and your divide-and-conquer solution is around O(3^n m^1.58).
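As a rough empirical check, here is a timing sketch I'm adding (it assumes the pow1 and pow2 definitions above); exact numbers will depend on the interpreter's multiplication algorithm:

import time

x, y = 3, 50000  # a large enough exponent that big-integer costs dominate
for f in (pow1, pow2):
    start = time.time()
    f(x, y)
    print(f.__name__, time.time() - start)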
I believe the space complexity is just O(n), since the set is the only thing stored throughout the program and the list is recalculated each time. I'm not sure whether the time complexity is O(n^2), because there is a while loop with a for loop inside it, or whether it is something different, since the while loop can keep running as long as n is never 1 and never in the set.
def isHappy(self, n):
    seen = set()
    while True:
        if n not in seen:
            seen.add(n)
            n = sum([int(x) * int(x) for x in str(n)])
            if n == 1:
                return True
        else:
            return False
EDIT:
The previous statement about average time complexity was incorrect, as it did not take the complexity of the summing of squares of n's decimal digits into account.
Forgive me in advance for the lack of mathematical formatting. There's no easy way to do that in StackOverflow posts.
The short answer is that your solution will not enter an infinite loop, and it does indeed have O(n) space complexity and O(n**2) time complexity.
Here's the long answer:
Let f(n) denote the result of summing the squares of n's decimal digits, as is being done inside the while loop. If n has four or more digits, then f(n) is guaranteed to have fewer digits than n, since f(9999) == 4 * 9**2 == 324, and the difference between 10**k - 1 and f(10**k - 1) increases as k increases. So for an n with four or more digits, it takes at most log10(n) iterations of the loop to get to a three-digit number. And since f(999) == 3 * 9**2 == 243, no matter how many times you apply n = f(n) to an n with three or fewer digits, the result will also have three or fewer digits. There are only 1000 nonnegative integers with three or fewer digits, so by the Pigeonhole Principle, f(n) will either equal one or already be contained in the set after at most 1001 iterations. In total, that's no more than log10(n) + 1001 iterations of the loop, where n here refers to the original value of the function argument.
For a set s, insertion and membership testing are both O(len(s)) in the worst case. Since the set can contain only as many elements as there are past iterations, len(s) <= log10(n) + 1001.
And log10(n) + 1001 is O(n), but not O(log(n)), since complexity is measured in terms of the size of the input (the number of digits), not the value of the input itself. And since, during a given iteration, n either has fewer digits than it originally had or fewer than four digits, the summing of squares is also O(n) in the number of digits. In total, that's O(n) iterations costing O(n) each, for a total worst-case time complexity of O(n**2).
As explained above, you're guaranteed to reach a three-digit number eventually no matter how large n is, so you can actually replace the set with a list of 1000 bools. Then the solution would have O(1) space complexity.
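Here is a minimal sketch of that O(1)-space variant (my code, not the original poster's); it relies on the fact shown above that once n drops below 1000 it stays below 1000:

def is_happy(n):
    seen = [False] * 1000  # fixed-size table: O(1) space
    while n != 1:
        if n < 1000:
            if seen[n]:       # revisiting a small value means a cycle
                return False
            seen[n] = True
        n = sum(int(d) ** 2 for d in str(n))
    return True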
The best-case scenario is O(1), if the program happens to get the answer on the first attempt. The worst-case scenario might be O(n^2), since the loop iterates over itself again and again. If you want a more precise answer, you can introduce a new constant to represent the cost of the following expression:
sum([int(x) * int(x) for x in str(n)])
Let's represent that cost with a constant r; then the worst-case complexity becomes O(n^2 + r).
Consider this code:
import time
import matplotlib.pyplot as plt

# Returns 2^n
def pow(n):
    if n == 0:
        return 1
    x = pow(n//2)
    if n % 2 == 0:
        return x*x
    return 2*x*x

y = [10**4, 10**5, 10**6, 10**7, 10**8, 10**9, 10**10]
z = []
for n in y:
    start = time.time()
    pow(n)
    print(n, time.time() - start)  # elapsed time
    z.append(time.time() - start)

plt.plot(y, z)
plt.show()
I am trying to figure out the time complexity of the recursive function pow(n). I calculated it as O(log(n)), but when measuring with time.time() the function appears to be linear. How come? Why is the time complexity O(n) and not O(log(n))?
If you replace all appearances of x*x in your example with a constant (e.g. 1), or replace multiplication with addition, you will see that you indeed get O(log(n)) complexity, as your function is invoked log(n) times (though we are measuring really small times in this case, so the results of using time might not reflect the complexity of the function). I think the conclusion is that your assumption that multiplication is O(1) is not correct (see e.g. this question), in particular as the numbers you multiply are really large and no longer fit in a traditional 32/64-bit representation.
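To see this in code, here is a sketch of my own in which the expensive squaring is replaced with a constant, as the answer suggests; each of the ~log2(n) calls now costs O(1), and the timings stop growing linearly:

import time

def pow_cheap(n):
    # Same recursion shape as pow(n), but constant work per call.
    if n == 0:
        return 1
    pow_cheap(n // 2)
    return 1  # stand-in for x*x / 2*x*x

for n in (10**4, 10**6, 10**8, 10**10):
    start = time.time()
    pow_cheap(n)
    print(n, time.time() - start)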
The question is available here. My Python code is
def solution(A, B):
    if len(A) == 1:
        return [1]
    ways = [0] * (len(A) + 1)
    ways[1], ways[2] = 1, 2
    for i in xrange(3, len(ways)):
        ways[i] = ways[i-1] + ways[i-2]
    result = [1] * len(A)
    for i in xrange(len(A)):
        result[i] = ways[A[i]] & ((1 << B[i]) - 1)
    return result
The time complexity detected by the system is O(L^2), and I can't see why. Thank you in advance.
First, let's show that the runtime genuinely is O(L^2). I copied a section of your code, and ran it with increasing values of L:
import time
import matplotlib.pyplot as plt

def solution(L):
    if L == 0:
        return
    ways = [0] * (L + 5)
    ways[1], ways[2] = 1, 2
    for i in xrange(3, len(ways)):
        ways[i] = ways[i-1] + ways[i-2]

points = []
for L in xrange(0, 100001, 10000):
    start = time.time()
    solution(L)
    points.append(time.time() - start)

plt.plot(points)
plt.show()
The resulting graph (image not reproduced here) shows a clearly quadratic growth curve.
To understand why this is O(L^2) when the obvious "time complexity" calculation suggests O(L), note that "time complexity" is not a well-defined concept on its own, since it depends on which basic operations you're counting. Normally the basic operations are taken for granted, but in some cases you need to be more careful. Here, if you count additions as a basic operation, then the code is O(L). However, if you count bit (or byte) operations, then the code is O(L^2). Here's the reason:
You're building an array of the first L Fibonacci numbers. The length (in digits) of the i'th Fibonacci number is Theta(i). So ways[i] = ways[i-1] + ways[i-2] adds two numbers with approximately i digits, which takes O(i) time if you count bit or byte operations.
This observation gives you an O(L^2) bit operation count for this loop:
for i in xrange(3, len(ways)):
    ways[i] = ways[i-1] + ways[i-2]
In the case of this program, it's quite reasonable to count bit operations: your numbers are unboundedly huge as L increases and addition of huge numbers is linear in clock time rather than O(1).
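A quick sketch of my own illustrates the point: adding two k-bit integers takes time roughly proportional to k, so it is far from O(1) once the operands are huge:

import time

for bits in (10**6, 2 * 10**6, 4 * 10**6):
    a = (1 << bits) - 1  # an integer with the given bit length
    start = time.time()
    for _ in range(1000):
        a + a
    # Elapsed time roughly doubles as the bit length doubles.
    print(bits, time.time() - start)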
You can fix the complexity of your code by computing the Fibonacci numbers mod 2^32 -- since 2^32 is a multiple of 2^B[i]. That will keep a finite bound on the numbers you're dealing with:
for i in xrange(3, len(ways)):
    ways[i] = (ways[i-1] + ways[i-2]) & ((1 << 32) - 1)
There are some other issues with the code, but this will fix the slowness.
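Putting that fix into the original function, a sketch of the corrected solution might look like this (assuming, as the answer does, that every B[i] <= 32):

def solution(A, B):
    if len(A) == 1:
        return [1]
    mask = (1 << 32) - 1
    ways = [0] * (len(A) + 1)
    ways[1], ways[2] = 1, 2
    for i in xrange(3, len(ways)):
        # Keeping only the low 32 bits bounds the size of every addition.
        ways[i] = (ways[i-1] + ways[i-2]) & mask
    return [ways[A[i]] & ((1 << B[i]) - 1) for i in xrange(len(A))]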
I've taken the relevant parts of the function:
def solution(A, B):
    for i in xrange(3, len(A) + 1):  # replaced ways for clarity
        # ...
    for i in xrange(len(A)):
        # ...
    return result
Observations:

1. A is an iterable object (e.g. a list)
2. You're iterating over the elements of A in sequence
3. The behavior of your function depends on the number of elements in A, making it O(A)
4. You're iterating over A twice, meaning 2 O(A) -> O(A)

On point 4, since 2 is a constant factor, 2 O(A) is still in O(A).
I think the page is not correct in its measurement. Had the loops been nested, then it would've been O(A²), but the loops are not nested.
This short sample is O(N²):
def process_list(my_list):
    for i in range(0, len(my_list)):
        for j in range(0, len(my_list)):
            pass  # do something with my_list[i] and my_list[j]
I've not seen the code the page is using to 'detect' the time complexity of the code, but my guess is that the page is counting the number of loops you're using without understanding much of the actual structure of the code.
EDIT1:
Note that, based on this answer, the time complexity of the len function is actually O(1), not O(N), so the page is not incorrectly counting its use toward the time complexity. If it were doing that, it would have claimed an even larger order of growth, because len is used 4 separate times.
EDIT2:
As #PaulHankin notes, asymptotic analysis also depends on what's considered a "basic operation". In my analysis, I've counted additions and assignments as "basic operations" by using the uniform cost method, not the logarithmic cost method, which I did not mention at first.
Most of the time, simple arithmetic operations are treated as basic operations. This is what I see most commonly being done, unless the algorithm being analysed is itself for a basic operation (e.g. the time complexity of a multiplication function), which is not the case here.
The only reason why we have different results appears to be this distinction. I think we're both correct.
EDIT3:
While an algorithm in O(N) is also in O(N²), I think it's reasonable to state that the code is still in O(N), because at the level of abstraction we're using, the computational steps that seem more relevant (i.e. more influential) are those in the loop, as a function of the size of the input iterable A, not of the number of bits used to represent each value.
Consider the following algorithm to compute a**n:
def function(a, n):
    r = 1
    for i in range(0, n):
        r *= a
    return r
Under the uniform cost method this is in O(N), because the loop executes n times. Under the logarithmic cost method, however, the algorithm turns out to be in O(N²) instead, because the multiplication at r *= a is itself in O(N): the number of bits needed to represent each number depends on the size of the number itself.
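A small timing sketch of my own makes the logarithmic-cost effect visible: doubling n roughly quadruples the runtime, because the later multiplications operate on ever longer numbers:

import time

a = 12345678901234567890  # a multi-word integer

for n in (2000, 4000, 8000):
    r = 1
    start = time.time()
    for i in range(n):
        r *= a  # cost grows with the current bit length of r
    print(n, time.time() - start)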
The Codility Ladder challenge is best solved as described here:
It is super tricky.
We first compute the Fibonacci sequence for the first L+2 numbers. The first two numbers are used only as fillers, so we have to index the sequence as A[idx]+1 instead of A[idx]-1. The second step is to replace the modulo operation by keeping only the n lowest bits.
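A sketch of that approach (my reconstruction of the description, not the author's exact code): build a Fibonacci table with two filler entries at the front, and mask each answer to its B[idx] lowest bits:

def ladder(A, B):
    mask32 = (1 << 32) - 1
    fib = [0, 1]                     # two filler entries at the front
    for _ in range(max(A) + 1):
        fib.append((fib[-1] + fib[-2]) & mask32)
    # Ways to climb A[idx] rungs is fib[A[idx] + 1]; keep B[idx] low bits.
    return [fib[a + 1] & ((1 << b) - 1) for a, b in zip(A, B)]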
I need to compute the ratio of two numbers that are accumulated in a loop.
The problem is that b becomes too big and equals numpy.inf at some point.
However, the ratio a/b should exist and not be zero.
for i in range(num_iter):
    a += x[i]
    b += y[i]
return a/b
What tricks are there for computing this type of limit?
Please let me know if this is the wrong Stack Exchange site for this question.
Update:
The loop is finite; I have two arrays x and y that can be analysed in advance for large values or the like.
I guess dividing x and y by some large number (rescaling) might work?
You don't say what you are adding to a and b each time through the loop, but presumably both values get so large that any error introduced by truncating the increments to integers will be negligible in the limit. This way, you use arbitrary-precision integers rather than floating-point values, which have both an upper bound on their magnitude and limited precision.
for i in range(num_iter):
    a += int(...)
    b += int(...)
return a/b
Building on Chepner's idea, how about tracking the float part and the int part separately, then bringing the int part back once it is larger than 1? Something like this:
for i in range(num_iter):
    afloat += ... - int(...)      # accumulate only the fractional parts
    bfloat += ... - int(...)
    a += int(...) + int(afloat)   # move any whole units into the integer totals
    b += int(...) + int(bfloat)
    afloat -= int(afloat)         # keep just the fractional remainder
    bfloat -= int(bfloat)
return a/b
If a and b have the same length, the ratio of the means is equal to the ratio of the sums. If they don't, you can use the ratio of the numbers of items to correct your ratio.
for i in xrange(num_iter):
    a = numpy.append(a, ...)   # numpy.append returns a new array
    b = numpy.append(b, ...)
return (numpy.mean(a) / numpy.mean(b)) * (float(len(b)) / len(a))
It could be slow and it will use more memory, but I think it should work.
If you don't want to save everything, you can calculate the mean for every N elements, and do a weighted mean when you need to calculate it.
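As a sketch of that idea (my own code): keep running means via incremental updates, which never overflow the way raw sums do, and divide the means at the end:

def ratio_of_sums(x, y):
    # Running means updated incrementally; mean_x / mean_y equals
    # sum(x) / sum(y) when x and y have the same length.
    mean_x = mean_y = 0.0
    for i, (xi, yi) in enumerate(zip(x, y), start=1):
        mean_x += (xi - mean_x) / i
        mean_y += (yi - mean_y) / i
    return mean_x / mean_y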