I was curious about the time complexity of Python's itertools.combinations function. I did some searching and it seems that many resources claim the time complexity is O(n!) or O(nCr).
However, for the two extremes of when r = 1 and r = n, the formula nCr reduces to n and 1, respectively. Does this mean we can conclude that the time complexity of itertools.combinations is O(n)?
r = 1 and r = n are (almost) best cases, not worst cases (r = 0 is the true lower extreme). The worst case, at least in terms of the number of combinations, is r = n/2. So if you do want to express the complexity in terms of just n, it's O(nC(n/2)) or O(n × nC(n/2)), depending on what you do with the tuples.
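A quick check with math.comb illustrates that nCr peaks in the middle:

```python
from math import comb

n = 10
counts = [comb(n, r) for r in range(n + 1)]
# nCr is smallest at the extremes (r = 0 and r = n) and largest at r = n/2.
print(counts)  # [1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1]
print(max(counts) == comb(n, n // 2))  # True
```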
Does this mean we can conclude that the time complexity of itertools.combinations is O(n)?
No, we can't conclude that.
Consider this statement:
Algorithm X to compute the combinations of r values from a set of n is O(nCr).
It is in fact trivial to prove[1] that any algorithm to do the above must be at least O(nCr). But we would need to examine the actual code to determine that a given algorithm is exactly O(nCr).
What you are doing is setting one or both of the variables to fixed values. This is effectively changing the problem.
To illustrate this, we can just substitute the fixed values into the above statement; e.g.
Algorithm X to compute the combinations of 1 value from a set of n is O(nC1).
Since nC1 is n, we can rewrite this as:
Algorithm X to compute the combinations of 1 value from a set of n is O(n).
But notice that this problem is different from the original one.
In short, this does NOT invalidate the original statement.
Note that the (alleged) claim that itertools.combinations is O(n!) is (I think) a misreading on your part. What that "source" actually says is:
"Getting all combinations of a list is O(n!). Since you're doing that n times to get combinations with different values of r, the whole algorithm is O(n * n!)."
My reading is that it is talking about nPn (permutations), not nCr. But either way, it is too vague to be considered a credible source.
[1] Informal proof. The set of combinations of r values from a set of n has nCr elements. Constructing this set entails adding nCr elements to a data structure, which involves (at least) nCr memory writes. It will take (at least) O(nCr) time to do this, assuming a (real-world) computer with an upper limit on memory bandwidth.
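The lower bound is easy to sanity-check by counting the output of itertools.combinations directly (a small sketch):

```python
import itertools
from math import comb

n, r = 8, 3
produced = sum(1 for _ in itertools.combinations(range(n), r))
# Any algorithm must emit comb(n, r) tuples, so it does at least that much work.
print(produced == comb(n, r))  # True
```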
Related
I have a list with tuples:
tuple_list = [(1,3),(4,7),(8,1),(5,4),(9,3),(7,2),(2,7),(3,1),(8,9),(5,2)]
From this list, I want to return the minimum distance of two numbers in a tuple.
In the naive approach, I would do the following:
distance = 10
for tup in tuple_list:
    if abs(tup[0] - tup[1]) < distance:
        distance = abs(tup[0] - tup[1])
Then, in the end, distance would equal 1.
However, I suspect there is a faster method to obtain the minimum distance that calculates all the distances in parallel.
To be clear: in the CPython reference interpreter, parallelizing this computation is pretty useless. The GIL prevents you from gaining meaningful benefit from CPU-bound work like this unless the work can be done by an extension that manually releases the GIL, using non-Python types. numpy could gain you some benefit by vectorizing (if the data were already in a numpy array), and vectorization is likely to do better than actual parallelization anyway, unless the data is enormous. But no matter how you slice it, the general case, for arbitrary data, is O(n): every item must be considered, so even in ideal circumstances you're just applying a constant divisor to the work, and it remains O(n).
You can simplify your code a bit, and use constructs that are better optimized in CPython, e.g.
distance = min(abs(d1 - d2) for d1, d2 in tuple_list)
which will compute abs(d1 - d2) only once per loop, and potentially save a little overhead over the plain for loop + if check (plus, it'll remove the need to come up with an initializer for distance that's definitely larger than the minimum that should replace it), but it's still O(n), it's just simpler code with some minor micro-optimizations.
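For completeness, the vectorized numpy version mentioned above might look like this (a sketch, assuming the data is already in a 2-column array; still O(n), just with a smaller constant factor):

```python
import numpy as np

arr = np.array([(1, 3), (4, 7), (8, 1), (5, 4), (9, 3),
                (7, 2), (2, 7), (3, 1), (8, 9), (5, 2)])
# Subtract the two columns element-wise, take absolute values, then the minimum.
distance = np.abs(arr[:, 0] - arr[:, 1]).min()
print(distance)  # 1
```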
In some special cases you could improve on this though. If you must regularly modify the list, and must be able to quickly determine the smallest difference at any given point in time, you could use a heap with precomputed differences. Adding a new item, or removing the minimum item, in the heap would be O(log n) (constructing the heap in the first place being O(n)), and getting the current smallest item would be O(1) (it's always in index 0).
Constructing the heap in the first place:
import heapq
tuple_list = [(1,3),(4,7),(8,1),(5,4),(9,3),(7,2),(2,7),(3,1),(8,9),(5,2)]
tuple_heap = [(abs(a - b), (a, b)) for a, b in tuple_list] # O(n) work
heapq.heapify(tuple_heap) # O(n) work; tuple_heap.sort() would also work,
# but it would be O(n log n)
Adding a new item (where x and y are the items to add):
heapq.heappush(tuple_heap, (abs(x - y), (x, y))) # O(log n)
Popping off the current smallest item:
diff, tup = heapq.heappop(tuple_heap) # O(log n)
# Or to unpack values:
diff, (x, y) = heapq.heappop(tuple_heap) # O(log n)
Getting values from current smallest item (without removing it):
diff, tup = tuple_heap[0] # O(1)
# Or to unpack values:
diff, (x, y) = tuple_heap[0] # O(1)
Obviously, this only makes sense if you regularly need the current minimum item and the set of things to consider is constantly changing, but it's one of the few cases where you can get better than O(n) performance in common operations, without paying more than O(n) in setup costs.
The only way you could try to speed this up is a multi-threaded solution, calculating the tuple distance for each tuple in a separate thread. You would probably see a time advantage for large lists, but in terms of complexity it is still O(n). The solution you provided is already optimal: there is no approach to finding a minimum in a list better than O(n), since every element must be examined.
I'm looking to iterate over every third element in my list. But in thinking about Big-O notation, would the Big-O complexity be O(n) where n is the number of elements in the list, or O(n/3) for every third element?
In other words, even if I specify that the list should only be iterated over every third element, is Python still looping through the entire list?
Example code:
def function(lst):
    # iterating over every third element
    for i in lst[2::3]:
        pass
When using Big-O notation, we ignore any scalar multiples in front of the functions, because the algorithm still takes "linear time". Big-O notation describes the behaviour of an algorithm as it scales to large inputs.
Meaning: whether the algorithm considers every element of the list or every third element, the time complexity still scales linearly with the input size. For example, if the input size is doubled, it takes twice as long to execute, no matter whether you are looking at every element or every third element.
Mathematically, this follows from the constant M in the definition (https://en.wikipedia.org/wiki/Big_O_notation): f(x) is O(g(x)) if abs(f(x)) <= M * abs(g(x)) for all sufficiently large x. A constant factor such as 1/3 is simply absorbed into M.
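One practical wrinkle worth noting: lst[2::3] builds a new list, so the slice itself already costs O(n) time and O(n/3) memory before the loop even starts. itertools.islice visits the same elements lazily, without the copy (still O(n) overall; a small sketch):

```python
from itertools import islice

lst = list(range(30))
# Slicing copies roughly n/3 elements into a brand-new list first.
every_third_slice = lst[2::3]
# islice walks the original list lazily, with no intermediate copy.
every_third_lazy = list(islice(lst, 2, None, 3))
print(every_third_slice == every_third_lazy)  # True
```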
Big O notation would remain O(n) here.
Consider the following:
n = 10**6  # some big number
for i in range(n):
    print(i)
    print(i)
    print(i)
Does doing 3 actions count as O(3n) or O(n)? O(n). Does the real world performance slow down by doing three actions instead of one? Absolutely!
Big O notation is about looking at the growth rate of the function, not about the physical runtime.
Consider the following from the pandas library:
from pandas import DataFrame

# simple iteration O(n)
df = DataFrame([{"a": 4}, {"a": 3}, {"a": 2}, {"a": 1}])
for row in df.itertuples():
    print(row.a)

# iterrows iteration O(n)
for idx, row in df.iterrows():
    print(row["a"])

# apply/lambda iteration O(n)
df.apply(lambda x: print(x["a"]), axis=1)
All of these implementations can be considered O(n) (constant is dropped), however that doesn't necessarily mean that the runtime will be the same. In fact, method 3 should be about 800 times faster than method 1 (https://towardsdatascience.com/how-to-make-your-pandas-loop-71-803-times-faster-805030df4f06)!
Another answer that may help you: Why is the constant always dropped from big O analysis?
I see this text:
one problem arises:
But in normal quicksort, when one sub-array gets 1 element and the other gets the remaining n-1, the stack depth becomes O(n). Doesn't this contradict that fact? Where is the problem?
Where is the misunderstanding point for me on this topic?
You are misinterpreting the meaning of "constant fraction". It does not mean you place a constant number of elements in one partition, and the remaining in another. If you put k elements in one partition, the fraction is k/n, which is not constant.
On the other hand, if you always put a 1/k fraction of the elements (for some constant k) in one partition, for large enough n that's a linear number of elements. For k = 100000, that's still n/100000 elements on one side and (99999/100000)·n on the other.
The important part is "constant fraction of the elements". If you hit a degenerate Quicksort case where the partitioning puts a constant number (e.g. 1) of elements on one side of the partition, you'll get O(n^2) time complexity.
But if every partitioning call puts a constant fraction (e.g. 99%) of elements on one side, you'll still get O(n log n) (albeit with a larger constant factor).
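The difference shows up directly in the recursion depth. A small sketch that models just the split sizes (not a real Quicksort) makes it concrete: a constant-fraction split gives O(log n) depth, while a constant-number split gives O(n) depth.

```python
import math

def depth_constant_fraction(n, frac=0.99):
    # Each level keeps at most a 99% fraction of the elements.
    depth = 0
    while n > 1:
        n = math.floor(n * frac)
        depth += 1
    return depth

def depth_constant_number(n):
    # Degenerate case: exactly one element is split off per level.
    return max(n - 1, 0)

print(depth_constant_fraction(10**6))  # on the order of 10^3: grows like log n
print(depth_constant_number(10**6))    # 999999: grows like n
```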
import numpy as np

def check_set(S, k):
    S2 = k - S                      # numpy broadcasting: k minus each element
    set_from_S2 = set(S2.flatten())
    for x in S:
        if x in set_from_S2:
            return True
    return False
I have a given integer k. I want to check if k is equal to sum of two element of array S.
S = np.array([1,2,3,4])
k = 8
It should return False in this case because no two elements of S sum to 8. The above code treats it as 8 = 4 + 4 (using the same element twice), so it returned True.
I can't find an algorithm to solve this problem with complexity of O(n).
Can someone help me?
You have to account for multiple instances of the same item, so a set is not a good choice here.
Instead, you can use a dictionary mapping each value to its number of occurrences (or, as a variant, collections.Counter).
A = [3, 1, 2, 3, 4]
Cntr = {}
for x in A:
    if x in Cntr:
        Cntr[x] += 1
    else:
        Cntr[x] = 1

#k = 11
k = 8
ans = False
for x in A:
    if (k - x) in Cntr:
        if k == 2 * x:
            if Cntr[k - x] > 1:
                ans = True
                break
        else:
            ans = True
            break
print(ans)
Returns True for k=5,6 (I added one more 3) and False for k=8,11
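The same idea reads a bit more compactly with collections.Counter, mentioned above as a variant (a sketch):

```python
from collections import Counter

def has_pair_sum(A, k):
    cnt = Counter(A)  # value -> number of occurrences, built in O(n)
    for x in A:
        need = k - x
        if need in cnt:
            # A pair (x, x) only counts if x occurs at least twice.
            if need != x or cnt[x] > 1:
                return True
    return False

print(has_pair_sum([3, 1, 2, 3, 4], 6))  # True (3 + 3)
print(has_pair_sum([3, 1, 2, 3, 4], 8))  # False (only one 4)
```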
Adding onto MBo's answer.
"Optimal" can be an ambiguous term in terms of algorithmics, as there is often a compromise between how fast the algorithm runs and how memory-efficient it is. Sometimes we may also be interested in either worst-case resource consumption or in average resource consumption. We'll loop at worst-case here because it's simpler and roughly equivalent to average in our scenario.
Let's call n the length of our array, and let's consider 3 examples.
Example 1
We start with a very naive algorithm for our problem, with two nested loops that iterate over the array, and check for every two items of different indices if they sum to the target number.
Time complexity: the worst-case scenario (where the answer is False, or where it's True but we only find it on the last pair of items we check) performs n^2 loop iterations. In big-O notation, the algorithm's time complexity is O(n^2), which means the time it takes grows more or less like n^2 up to a multiplicative factor (technically, the notation means "at most like n^2 up to a multiplicative factor", but it's a common abuse of language to read it as "more or less like" instead).
Space complexity (memory consumption): we only store an array, plus a fixed set of objects whose sizes do not depend on n (everything Python needs to run, the call stack, maybe two iterators and/or some temporary variables). The part of the memory consumption that grows with n is therefore just the size of the array, which is n times the amount of memory required to store an integer in an array (let's call that sizeof(int)).
Conclusion: Time is O(n^2), Memory is n*sizeof(int) (+O(1), that is, up to an additional constant factor, which doesn't matter to us, and which we'll ignore from now on).
Example 2
Let's consider the algorithm in MBo's answer.
Time complexity: much, much better than in Example 1. We start by creating a dictionary, in a single loop over the n elements. Setting a key in a dictionary is a constant-time operation under proper conditions, so the time taken by each step of that first loop does not depend on n; so far we've used O(n) time. Then we have one remaining loop over n, and the time spent accessing elements of our dictionary is independent of n, so that loop is again O(n). Since both loops grow like n up to a multiplicative factor, so does their sum (up to a different multiplicative factor). Total: O(n).
Memory: Basically the same as before, plus a dictionary of n elements. For the sake of simplicity, let's consider that these elements are integers (we could have used booleans), and forget about some of the aspects of dictionaries to only count the size used to store the keys and the values. There are n integer keys and n integer values to store, which uses 2*n*sizeof(int) in terms of memory. Add to that what we had before and we have a total of 3*n*sizeof(int).
Conclusion: Time is O(n), Memory is 3*n*sizeof(int). The algorithm is considerably faster when n grows, but uses three times more memory than example 1. In some weird scenarios where almost no memory is available (embedded systems maybe), this 3*n*sizeof(int) might simply be too much, and you might not be able to use this algorithm (admittedly, it's probably never going to be a real issue).
Example 3
Can we find a trade-off between Example 1 and Example 2?
One way to do that is to replicate the same kind of nested loop structure as in Example 1, but with some pre-processing to replace the inner loop with something faster. To do that, we sort the initial array, in place. Done with well-chosen algorithms, this has a time-complexity of O(n*log(n)) and negligible memory usage.
Once we have sorted our array, we write our outer loop (a regular loop over the whole array), and inside it we use dichotomy (binary search) to look for the number we're missing to reach our target k. This binary search has a memory consumption of O(log(n)) (for the call stack, if written recursively), and its time complexity is O(log(n)) as well.
Time complexity: The pre-processing sort is O(n*log(n)). Then in the main part of the algorithm, we have n calls to our O(log(n)) dichotomy search, which totals to O(n*log(n)). So, overall, O(n*log(n)).
Memory: Ignoring the constant parts, we have the memory for our array (n*sizeof(int)) plus the memory for our call stack in the dichotomy search (O(log(n))). Total: n*sizeof(int) + O(log(n)).
Conclusion: Time is O(n*log(n)), Memory is n*sizeof(int) + O(log(n)). Memory is almost as small as in Example 1. Time complexity is slightly more than in Example 2. In scenarios where the Example 2 cannot be used because we lack memory, the next best thing in terms of speed would realistically be Example 3, which is almost as fast as Example 2 and probably has enough room to run if the very slow Example 1 does.
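Example 3 might be sketched with the standard bisect module (names are illustrative; bisect performs the dichotomy search for us):

```python
import bisect

def has_pair_sum_sorted(S, k):
    S = sorted(S)                          # O(n log n) pre-processing
    for i, x in enumerate(S):
        # Binary-search for k - x: O(log n) per outer iteration.
        j = bisect.bisect_left(S, k - x)
        if j < len(S) and S[j] == k - x and j != i:
            return True
        # bisect_left may land on i itself; a duplicate can sit just after it.
        if j + 1 < len(S) and S[j + 1] == k - x:
            return True
    return False

print(has_pair_sum_sorted([1, 2, 3, 4], 5))  # True (1 + 4)
print(has_pair_sum_sorted([1, 2, 3, 4], 8))  # False (only one 4)
```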
Overall conclusion
This answer was just to show that "optimal" is context-dependent in algorithmics. It's very unlikely that in this particular example, one would choose to implement Example 3. In general, you'd see either Example 1 if n is so small that one would choose whatever is simplest to design and fastest to code, or Example 2 if n is a bit larger and we want speed. But if you look at the wikipedia page I linked for sorting algorithms, you'll see that none of them is best at everything. They all have scenarios where they could be replaced with something better.
What is Big-O complexity of random.choice(list) in Python3, where n is amount of elements in a list?
Edit: Thank you all for giving me the answer, now I understand.
O(1). Or to be more precise, it's equivalent to the big-O random access time for looking up a single index in whatever sequence you pass it, and list has O(1) random access indexing (as does tuple). Simplified, all it does is seq[random.randrange(len(seq))], which is obviously equivalent to a single index lookup operation.
An example where it would be O(n) is collections.deque, where indexing in the middle of the deque is O(n) (with a largish constant divisor though, so it's not that expensive unless the deque is reaching the thousands of elements range or higher). So basically, don't use a deque if it's going to be large and you plan to select random elements from it repeatedly, stick to list, tuple, str, byte/bytearray, array.array and other sequence types with O(1) indexing.
Though the question is about random.choice, and the previous answers give several explanations of it, when I searched for the complexity of np.random.choice I didn't find an answer, so I decided to explain np.random.choice here.
choice(a, size=None, replace=True, p=None). Assume a.shape=(n,) and size=m.
When with replacement:
The complexity of np.random.choice is O(m) if p is not specified (it is then treated as a uniform distribution), and O(n + m log n) if p is specified.
The github code can be found here: np.random.choice.
When p is not specified, choice generates an index array by randint and returns a[index], so the complexity is O(m). (I assume the operation of generating a random integer by randint is O(1).)
When p is specified, the function first computes prefix sum of p. Then it draws m samples from [0, 1), followed by using binary search to find a corresponding interval in the prefix sum for each drawn sample. The evidence to use binary search can be found here. So this process is O(n + m log n). If you need a faster method in this situation, you can use Alias Method, which needs O(n) time for preprocessing and O(m) time to sample m items.
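The prefix-sum + binary-search sampling described above can be sketched with np.searchsorted (an illustration of the idea, not NumPy's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.1, 0.2, 0.3, 0.4])
m = 10_000

cdf = np.cumsum(p)                            # O(n) prefix sums
u = rng.random(m)                             # m uniform draws from [0, 1)
idx = np.searchsorted(cdf, u, side="right")   # O(m log n) binary searches
idx = np.minimum(idx, len(p) - 1)             # guard against float round-off at the top
freq = np.bincount(idx, minlength=len(p)) / m
print(freq)  # close to [0.1, 0.2, 0.3, 0.4]
```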
When without replacement: (It's kind of complicated, and maybe I'll finish it in the future.)
If p is not specified, the complexity is the same as np.permutation(n), even when m is only 1. See more here.
If p is specified, the expected complexity is at least $n \log n \log\frac{n}{n + 1 - m}$ (this bound is not tight).
The complexity of random.choice(list) is O(log n) where n is the number of elements in the list.
The cpython implementation uses _randbelow(len(seq)) to get a pseudo-random index and then returns the item at that index.
The bottleneck is the _randbelow() function, which uses rejection sampling to generate a number in the range [0, n). The function generates k pseudo-random bits with a call to getrandbits(k), where k = ceil(log2 n). These bits represent a number in the range [0, 2**k). The process is repeated until the generated number is less than n. Each call to the pseudo-random number generator runs in O(k), where k is the number of bits generated, which is O(log n).
I think the above answer is incorrect. I empirically verified that the complexity of this operation is O(n). Here is my code and a little plot. I am not sure about the theory though.
from time import time
import numpy as np
import matplotlib.pyplot as plt
N = np.logspace(2, 10, 40)
output = []
for i, n in enumerate(N):
    print(i)
    n = int(n)
    stats = time()
    A = np.random.choice(list(range(n)), n // 2)
    output.append(time() - stats)
plt.plot(N, output)
This is the plot I got which looks quite linear to me.