Time and Space Complexity Trouble - python

I've seen so many time complexity problems but none seem to aid in my understanding of it - like really get it.
What I have taken from my readings and practice attempts all seems to come down to what was mentioned in the answer coder gave here: Determining complexity for recursive functions (Big O notation) - which in fact did help me understand a little more about what's going on with time complexity.
What about a function such as this:
def f(n):
    if n < 3:
        return n
    if n >= 3:
        return f(n-1) + 2*f(n-2) + 3*f(n-3)
Since the function calls the function 3 times, does that mean that the time complexity is O(3^n)?
As for the space complexity, it seems to be linear hence I propose the complexity to be O(n).
Am I wrong about this?

Since the function calls the function 3 times
This isn't really correct; instead, let's use examples that are more exact than your ad-hoc example.
def constant(n):
    return n*12301230
This will always run in the same amount of time and is therefore O(1)
def linear(n):
    total = 0
    for x in range(n):
        total += 1
    return total
This has O(N) time
def quadratic(n):
    total = 0
    for x in range(n):
        for y in range(n):
            total += 1
    return total
This runs in quadratic time O(N^2) since the inner loop runs n times and the outer loop runs n times.
There are also more specific examples for log(N), N*log(N), 2^N, etc., but going back to your question:
Since the function calls the function 3 times, does that mean that the time complexity is O(3^n)?
If the function is called 3 times, it will still be constant time for constant(x), linear for linear(x) and quadratic for quadratic(x) - a constant number of calls only multiplies the cost by a constant. Importantly, O(3^n) is exponential time and is not the same as n^3. And although each call here spawns three more calls, two of them shrink the argument faster (by 2 and by 3), so the total number of calls grows more slowly than 3^n; it is bounded above by O(2^n), which is the base normally quoted for this kind of recurrence rather than 3.
So your function runs in constant time for n < 3. For the recursive case, the easiest way to get a feel for the actual growth is to measure it (or count the calls), because recursive costs are hard to work out by eye. If you provide another, non-recursive example I'll be happy to tell you its complexity.
Hope this helps a bit - 2^n grows so much faster than n^2 that a graph hardly does it justice, but it's a good start.
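As a quick sanity check, here is a small sketch (not code from the question) that simply counts how many calls f makes for increasing n; the per-step growth ratio settles at roughly 1.84, below 2, which is consistent with the O(2^n) upper bound mentioned above.

def count_calls(n):
    # Count how many times f is invoked for a given n.
    calls = 0

    def f(m):
        nonlocal calls
        calls += 1
        if m < 3:
            return m
        return f(m - 1) + 2 * f(m - 2) + 3 * f(m - 3)

    f(n)
    return calls

prev = None
for n in range(5, 26, 5):
    c = count_calls(n)
    # Per-step growth ratio over the last 5 steps; approaches roughly 1.84 (< 2).
    ratio = (c / prev) ** (1 / 5) if prev else float("nan")
    print(n, c, round(ratio, 3))
    prev = c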


What approach is best to decrease the time complexity of this problem

I want to preface this thread by stating that I am still learning the basics of data structures and algorithms. I'm not looking for the correct code for this problem, but rather what the correct approach is, so that I can learn which situations call for which data structure. That being said, I am now going to try and correctly explain this code.
The code below is a solution I had written for a medium-level leetcode problem. Please see the link to read the problem
Correct me if I am wrong: currently the time complexity of this algorithm is O(n).
from typing import List

class Solution:
    def canCompleteCircuit(self, gas: List[int], cost: List[int]):
        startingStation = 0
        didCircuit = -1
        tank = 0
        i = 0
        while i <= len(gas):
            if startingStation == len(gas):
                return -1
            if startingStation == i:
                didCircuit += 1
                if didCircuit == 1:
                    return startingStation
            tank += gas[i] - cost[i]
            if tank >= 0:
                i += 1
                if i == len(gas):
                    i = 0
            if tank < 0:
                didCircuit = -1
                startingStation += 1
                i = startingStation
                tank = 0
The code works fine but the time complexity is too slow to iterate through each test case. What I am asking is if this algorithm is O(n) what approach could I have used to make the runtime complexity of this algorithm O(log(n)) or just faster?
Side question - I know having a lot of if statements is bad and ugly code, but if all of the iterations are O(1), does the number of if statements have any impact on the performance of this function when scaled to a high iteration count?
"Correct me if I am wrong, currently the time complexity of this algorithn is O(n)"
This algorithm is O(n^2) rather than O(n). In the best case, it will return an answer in only "n" iterations of the while loop, but in the situation where there is no answer, it needs to run the loop (n*(n+1))/2 times.
O() notation tells us to ignore practical values of n and remove terms that become insignificant as n grows very large. So we ignore the +n and the /2 in the iterations, with the most significant component being the n^2.
So it is an O(n^2) algorithm.
"if all of the iterations are O(1) does the amount of if statements have any impact on the performance of this function if scaled to a high iteration count"
No, the O() of the algorithm is not impacted by the number of logic statements, but beware of hidden loops and expensive operations. For example, a logic statement of if x in list can be O(n) on the number of items in the list without data-specific optimizations, so if you have an O(n) loop around it (for the same list) you could have an O(n^2) algorithm. None of your logic statements have this issue, you can ignore them for O() purposes.
Assignments can be treated the same.
"What I am asking is if this algorithm is O(n) what approach could I have used to make the runtime complexity of this algorithm O(log(n)) or just faster?"
Since the algorithm is not O(n), better to ask how you might get there. You can get there by finding a way to not have to loop over the arrays more than once.
You ask about data structures, but you talk about time complexity.
The best algorithm in this case is O(n) in time, and O(1) in additional space. It requires you to store one integer in addition to the two arrays. You can even implement it with three integers of storage if you keep reading the gas and cost values from streams of data.
"I'm not looking for the correct code for this problem but rather what the correct approach is"
They've given you a gift with the statement that any successful solution is unique. From this we know that the amount of gas available is no more than the sum of all costs plus the smallest difference between a station's cost and gas. If it were otherwise, then there would be two points in the loop where you could start.
That means that as soon as we find an i where the sum of the gas available at stations 0 to i exceeds the cost of travel from 0 to i we have found the unique starting position. If we get to the end of the line and have not found this, we know it is impossible to do so for any starting position.
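For illustration, here is a sketch of the standard one-pass greedy solution (O(n) time, O(1) extra space); this is the common textbook approach rather than necessarily the exact bookkeeping described above:

from typing import List

def can_complete_circuit(gas: List[int], cost: List[int]) -> int:
    # One pass: track the net gas over the whole trip and the net gas since
    # the current candidate start. If the tank goes negative, no station up to
    # here can be the start, so restart from the next station.
    total = 0
    tank = 0
    start = 0
    for i in range(len(gas)):
        diff = gas[i] - cost[i]
        total += diff
        tank += diff
        if tank < 0:
            start = i + 1
            tank = 0
    return start if total >= 0 else -1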

Find runtime (number of operations of function) and calculate Big O

For the python function given below, I have to find the number of Operations and Big O.
def no_odd_number(list_nums):
    i = 0
    while i < len(list_nums):
        num = list_nums[i]
        if num % 2 != 0:
            return False
        i += 1
    return True
From my calculation, the number of operations is 4 + 3n but I'm not sure as I don't know how to deal with if...else statements.
I am also given options to choose the correct Big O from, from my calculation, I think it should be d. O(n) but I'm not sure. Help please!
a. O(n^2)
b. O(1)
c. O(log n)
d. O(n)
e. None of these
Big O notation typically considers the worst case scenario. The function you have is pretty simple, but the early return seems to complicate things. However, since we care about the worst case you can ignore the if block. The worst case will be one where you don't return early. It would be a list like [2,4,6,8], which would run the loop four times.
Now, look at the things inside the while loop, with the above in mind. It doesn't matter how big list_nums is: inside the loop you just increment i and lookup something in a list. Both of those are constant time operations that are the same regardless of how large list_nums is.
The number of times you do this loop is the length of list_nums. This means as list_nums grows, the number of operations grows at the same rate. That makes this O(n) as you suspect.

What is the difference in time complexity between these two blocks of code (if any) and why?

Trying to solidify my knowledge about Time Complexity. I think I know the answer to this, but would like to hear some good explanations.
import random

main = []
while len(main) < 5:
    sub = []
    while len(sub) < 5:
        sub.append(random.randint(1, 10))
    main.append(sub)
VS
main = []
sub = []
while len(main) < 5:
    sub.append(random.randint(1, 10))
    if len(sub) == 5:
        main.append(list(sub))
        sub = []
There's no difference, since the time complexity is constant in both cases - you perform a constant amount of operations both times.
The time complexity in both is O(1) - constant time - because they both perform a constant number of operations, as Yakov Dan already stated.
This is because time complexity is usually expressed as a function of a variable number (say n) and tends to show how changing the value of n will change the time the algorithm takes.
Now, assuming you had n instead of 5, you would have O(n^2) for both cases. It may be tricky for the second case, since a basic way of gauging polynomial complexity is to count the number of nested loops, and that can lead you to conclude that the second version is O(n) because it has a single loop.
However, carefully looking at it will show you that the loop runs n(5 in this case) times for sub for each value appended to main, so it is essentially the same.
This of course assumes that the in-built list.append is atomic or runs in a constant time.
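To make the comparison concrete, here is a sketch of both versions generalized from 5 to a parameter n; both end up appending to sub exactly n*n times, which is why they are both O(n^2):

import random

def build_nested(n):
    # First version generalized: two visible nested loops, n*n inner iterations.
    main = []
    while len(main) < n:
        sub = []
        while len(sub) < n:
            sub.append(random.randint(1, 10))
        main.append(sub)
    return main

def build_flat(n):
    # Second version generalized: one visible loop, but it still iterates
    # n*n times before main holds n sub-lists.
    main = []
    sub = []
    while len(main) < n:
        sub.append(random.randint(1, 10))
        if len(sub) == n:
            main.append(list(sub))
            sub = []
    return main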

What is optimal algorithm to check if a given integer is equal to sum of two elements of an int array?

import numpy as np

def check_set(S, k):
    S2 = k - S
    set_from_S2 = set(S2.flatten())
    for x in S:
        if x in set_from_S2:
            return True
    return False
I have a given integer k. I want to check if k is equal to sum of two element of array S.
S = np.array([1,2,3,4])
k = 8
It should return False in this case because there are no two elements of S having a sum of 8. The above code treats 8 = 4 + 4 as a valid pair, so it returned True.
I can't find an algorithm to solve this problem with complexity of O(n).
Can someone help me?
You have to account for multiple instances of the same item, so set is not good choice here.
Instead you can use a dictionary that maps each value to the number of times it occurs (or, as a variant, collections.Counter):
A = [3, 1, 2, 3, 4]
Cntr = {}
for x in A:
    if x in Cntr:
        Cntr[x] += 1
    else:
        Cntr[x] = 1

#k = 11
k = 8
ans = False
for x in A:
    if (k - x) in Cntr:
        if k == 2 * x:
            if Cntr[k - x] > 1:
                ans = True
                break
        else:
            ans = True
            break
print(ans)
Returns True for k=5,6 (I added one more 3) and False for k=8,11
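Since the answer mentions collections.Counter as a variant, here is a minimal sketch of the same idea wrapped in a function (the name check_pair is just illustrative):

from collections import Counter

def check_pair(A, k):
    # Count occurrences, then look up the complement of each element.
    # When the complement equals the element itself we need at least two copies.
    cnt = Counter(A)
    for x in A:
        need = k - x
        if need in cnt and (need != x or cnt[x] > 1):
            return True
    return False

print(check_pair([3, 1, 2, 3, 4], 6))  # True (3 + 3)
print(check_pair([1, 2, 3, 4], 8))     # False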
Adding onto MBo's answer.
"Optimal" can be an ambiguous term in terms of algorithmics, as there is often a compromise between how fast the algorithm runs and how memory-efficient it is. Sometimes we may also be interested in either worst-case resource consumption or in average resource consumption. We'll loop at worst-case here because it's simpler and roughly equivalent to average in our scenario.
Let's call n the length of our array, and let's consider 3 examples.
Example 1
We start with a very naive algorithm for our problem, with two nested loops that iterate over the array, and check for every two items of different indices if they sum to the target number.
Time complexity: the worst-case scenario (where the answer is False, or where it's True but we only find it on the last pair of items we check) has n^2 loop iterations. If you're familiar with big-O notation, we say the algorithm's time complexity is O(n^2), which basically means that in terms of our input size n, the time it takes to run the algorithm grows more or less like n^2 up to a multiplicative factor (well, technically the notation means "at most like n^2 up to a multiplicative factor", but it's a generalized abuse of language to use it as "more or less like" instead).
Space complexity (memory consumption): we only store an array, plus a fixed set of objects whose sizes do not depend on n (everything Python needs to run, the call stack, maybe two iterators and/or some temporary variables). The part of the memory consumption that grows with n is therefore just the size of the array, which is n times the amount of memory required to store an integer in an array (let's call that sizeof(int)).
Conclusion: Time is O(n^2), Memory is n*sizeof(int) (+O(1), that is, up to an additional constant factor, which doesn't matter to us, and which we'll ignore from now on).
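A sketch of what such a naive implementation might look like (not code from the question):

def check_pair_naive(S, k):
    # Compare every pair of distinct indices: O(n^2) time, no extra memory
    # beyond the array itself.
    n = len(S)
    for i in range(n):
        for j in range(n):
            if i != j and S[i] + S[j] == k:
                return True
    return False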
Example 2
Let's consider the algorithm in MBo's answer.
Time complexity: much, much better than in Example 1. We start by creating a dictionary. This is done in a loop over n. Setting keys in a dictionary is a constant-time operation in proper conditions, so the time taken by each step of that first loop does not depend on n. Therefore, so far we've used O(n) in terms of time complexity. Now we only have one remaining loop over n. The time spent accessing elements of our dictionary is independent of n, so once again, that loop is O(n). Combining the two loops: since they both grow like n up to a multiplicative factor, so does their sum (up to a different multiplicative factor). Total: O(n).
Memory: Basically the same as before, plus a dictionary of n elements. For the sake of simplicity, let's consider that these elements are integers (we could have used booleans), and forget about some of the aspects of dictionaries to only count the size used to store the keys and the values. There are n integer keys and n integer values to store, which uses 2*n*sizeof(int) in terms of memory. Add to that what we had before and we have a total of 3*n*sizeof(int).
Conclusion: Time is O(n), Memory is 3*n*sizeof(int). The algorithm is considerably faster when n grows, but uses three times more memory than example 1. In some weird scenarios where almost no memory is available (embedded systems maybe), this 3*n*sizeof(int) might simply be too much, and you might not be able to use this algorithm (admittedly, it's probably never going to be a real issue).
Example 3
Can we find a trade-off between Example 1 and Example 2?
One way to do that is to replicate the same kind of nested loop structure as in Example 1, but with some pre-processing to replace the inner loop with something faster. To do that, we sort the initial array, in place. Done with well-chosen algorithms, this has a time-complexity of O(n*log(n)) and negligible memory usage.
Once we have sorted our array, we write our outer loop (which is a regular loop over the whole array), and then inside that outer loop, use dichotomy (binary search) to find the number we're missing to reach our target k. This dichotomy approach would have a memory consumption of O(log(n)) (for a recursive implementation), and its time complexity would be O(log(n)) as well.
Time complexity: The pre-processing sort is O(n*log(n)). Then in the main part of the algorithm, we have n calls to our O(log(n)) dichotomy search, which totals to O(n*log(n)). So, overall, O(n*log(n)).
Memory: Ignoring the constant parts, we have the memory for our array (n*sizeof(int)) plus the memory for our call stack in the dichotomy search (O(log(n))). Total: n*sizeof(int) + O(log(n)).
Conclusion: Time is O(n*log(n)), Memory is n*sizeof(int) + O(log(n)). Memory is almost as small as in Example 1. Time complexity is slightly more than in Example 2. In scenarios where the Example 2 cannot be used because we lack memory, the next best thing in terms of speed would realistically be Example 3, which is almost as fast as Example 2 and probably has enough room to run if the very slow Example 1 does.
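A sketch of Example 3 using Python's built-in sort and the bisect module (bisect is iterative, so this particular version avoids even the O(log(n)) call-stack cost mentioned above):

import bisect

def check_pair_sorted(S, k):
    # Sort once (O(n*log(n))), then binary-search for each element's complement.
    arr = sorted(S)
    n = len(arr)
    for i, x in enumerate(arr):
        target = k - x
        j = bisect.bisect_left(arr, target)
        # Complement found at a different index.
        if j < n and arr[j] == target and j != i:
            return True
        # Complement equals x itself: check whether a second copy follows.
        if j == i and j + 1 < n and arr[j + 1] == target:
            return True
    return False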
Overall conclusion
This answer was just to show that "optimal" is context-dependent in algorithmics. It's very unlikely that in this particular example, one would choose to implement Example 3. In general, you'd see either Example 1 if n is so small that one would choose whatever is simplest to design and fastest to code, or Example 2 if n is a bit larger and we want speed. But if you look at the wikipedia page I linked for sorting algorithms, you'll see that none of them is best at everything. They all have scenarios where they could be replaced with something better.

How to make this Python for-loop run faster?

for j in range(0, NumberOfFeatures):
    for k in range(j+1, NumberOfFeatures):
        countArray = np.ones((2, 2))
        for i in range(0, NumberOfTrainingExamples):
            countArray[XTrain[i, j], XTrain[i, k]] += 1
The innermost for loop takes quite some time for large NumberOfFeatures, NumberOfTrainingExamples
It's O(n^3), roughly speaking (where the three n's are not the same number - more precisely, on the order of NumberOfFeatures^2 * NumberOfTrainingExamples).
Because the code is not complete it is very hard to determine what could be done better, but from what you provided, try to reduce it to at most n^2, otherwise it will just take a long time.
If you have 10 of each that's 1,000 cycles; with 1,000 of each it's 1,000,000,000, so with bigger numbers it gets slow very fast.
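Asymptotics aside, one practical way to speed this up is to push the innermost loop into NumPy. Below is a sketch under the assumption that XTrain holds only 0/1 values (which the 2x2 count array implies); the shapes and data are made up for illustration:

import numpy as np

NumberOfFeatures = 4
NumberOfTrainingExamples = 1000
XTrain = np.random.randint(0, 2, size=(NumberOfTrainingExamples, NumberOfFeatures))

for j in range(NumberOfFeatures):
    for k in range(j + 1, NumberOfFeatures):
        countArray = np.ones((2, 2))
        # Vectorized equivalent of the innermost loop: add 1 to
        # countArray[XTrain[i, j], XTrain[i, k]] for every example i.
        np.add.at(countArray, (XTrain[:, j], XTrain[:, k]), 1)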
