Find runtime (number of operations of function) and calculate Big O - python

For the Python function given below, I have to find the number of operations and the Big O.
def no_odd_number(list_nums):
    i = 0
    while i < len(list_nums):
        num = list_nums[i]
        if num % 2 != 0:
            return False
        i += 1
    return True
From my calculation, the number of operations is 4 + 3n, but I'm not sure because I don't know how to deal with if...else statements.
I am also given options to choose the correct Big O from; based on my calculation, I think it should be d. O(n), but I'm not sure. Help please!
a. O(n^2)
b. O(1)
c. O(log n)
d. O(n)
e. None of these

Big O notation typically considers the worst-case scenario. The function you have is pretty simple, but the early return seems to complicate things. However, since we care about the worst case, you can ignore the if block. The worst case is one where you don't return early, such as a list like [2,4,6,8], which runs the loop four times.
Now, look at the things inside the while loop with the above in mind. It doesn't matter how big list_nums is: inside the loop you just increment i and look up an element of a list. Both of those are constant-time operations that take the same time regardless of how large list_nums is.
The number of times you do this loop is the length of list_nums. This means as list_nums grows, the number of operations grows at the same rate. That makes this O(n) as you suspect.
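To see the linear growth concretely, you can count the loop iterations for all-even lists of increasing size (the worst case, with no early return). The sketch below is just an illustration; the ops counter is my own addition, not part of the original exercise.

def count_iterations(list_nums):
    ops = 0
    i = 0
    while i < len(list_nums):
        num = list_nums[i]
        ops += 1  # one pass through the loop body
        if num % 2 != 0:
            return ops
        i += 1
    return ops

for n in (4, 8, 16):
    print(n, count_iterations([2] * n))  # iteration count grows linearly with n

Doubling the input doubles the iteration count, which is exactly what O(n) describes.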

Related

What approach is best to decrease the time complexity of this problem?

I want to preface this thread by stating that I am still learning the basics of data structures and algorithms. I'm not looking for the correct code for this problem, but rather for the correct approach, so that I can learn which situations call for which data structure. That being said, I will now try to explain this code.
The code below is a solution I had written for a medium-level leetcode problem. Please see the link to read the problem
Correct me if I am wrong, but currently the time complexity of this algorithm is O(n).
from typing import List

class Solution:
    def canCompleteCircuit(self, gas: List[int], cost: List[int]):
        startingStation = 0
        didCircuit = -1
        tank = 0
        i = 0
        while i <= len(gas):
            if startingStation == len(gas):
                return -1
            if startingStation == i:
                didCircuit += 1
            if didCircuit == 1:
                return startingStation
            tank += gas[i] - cost[i]
            if tank >= 0:
                i += 1
                if i == len(gas):
                    i = 0
            if tank < 0:
                didCircuit = -1
                startingStation += 1
                i = startingStation
                tank = 0
The code works fine, but it is too slow to get through every test case. What I am asking is: if this algorithm is O(n), what approach could I have used to make its runtime complexity O(log(n)), or just faster?
Side question: I know having a lot of if statements is bad, ugly code, but if all of them are O(1), does the number of if statements have any impact on the performance of this function when scaled to a high iteration count?
"Correct me if I am wrong, currently the time complexity of this algorithn is O(n)"
This algorithm is O(n^2) rather than O(n). In the best case, it will return an answer in only "n" iterations of the while loop, but in the situation where there is no answer, it needs to run the loop (n*(n+1))/2 = (n^2 + n)/2 times.
O() notation tells us to ignore practical values of n and remove terms that become insignificant as n grows very large. So we ignore the +n and the /2 in the iterations, with the most significant component being the n^2.
So it is an O(n^2) algorithm.
"if all of the iterations are O(1) does the amount of if statements have any impact on the performance of this function if scaled to a high iteration count"
No, the O() of the algorithm is not impacted by the number of logic statements, but beware of hidden loops and expensive operations. For example, a logic statement of if x in list can be O(n) in the number of items in the list without data-specific optimizations, so if you have an O(n) loop around it (for the same list) you could have an O(n^2) algorithm. None of your logic statements have this issue, so you can ignore them for O() purposes.
Assignments can be treated the same.
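As an illustration of that hidden-loop pitfall (my own sketch, not taken from the question's code): the same membership test is O(n) against a list but O(1) on average against a set, which changes the overall complexity of the surrounding loop.

def count_common_slow(xs, ys):
    # ys is a list, so `x in ys` scans it: O(n) per test, O(n^2) overall
    count = 0
    for x in xs:
        if x in ys:
            count += 1
    return count

def count_common_fast(xs, ys):
    # one O(n) pass to build a set, then O(1) average lookups: O(n) overall
    ys_set = set(ys)
    count = 0
    for x in xs:
        if x in ys_set:
            count += 1
    return count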
"What I am asking is if this algorithm is O(n) what approach could I have used to make the runtime complexity of this algorithm O(log(n)) or just faster?"
Since the algorithm is not O(n), better to ask how you might get there. You can get there by finding a way to not have to loop over the arrays more than once.
You ask about data structures, but you talk about time complexity.
The best algorithm in this case is O(n) in time, and O(1) in additional space. It requires you to store one integer in addition to the two arrays. You can even implement it with three integers of storage if you keep reading the gas and cost values from streams of data.
"I'm not looking for the correct code for this problem but rather what the correct approach is"
They've given you a gift with the statement that any successful solution is unique. From this we know that the amount of gas available is no more than the sum of all costs plus the smallest difference between a station's cost and gas. If it were otherwise, there would be two points in the loop where you could start.
That means that as soon as we find an i where the sum of the gas available at stations 0 to i exceeds the cost of travel from 0 to i we have found the unique starting position. If we get to the end of the line and have not found this, we know it is impossible to do so for any starting position.
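For reference, here is a minimal sketch of a single-pass O(n) time, O(1) extra space solution. It is my own illustration of the usual greedy approach (track the tank, and whenever it goes negative, move the candidate start to the next station), not code taken verbatim from the answer above.

from typing import List

def can_complete_circuit(gas: List[int], cost: List[int]) -> int:
    total = 0  # overall surplus of gas minus cost across the whole circuit
    tank = 0   # surplus accumulated since the current candidate start
    start = 0
    for i in range(len(gas)):
        diff = gas[i] - cost[i]
        total += diff
        tank += diff
        if tank < 0:
            # We cannot reach station i + 1 from the current start, and no
            # station between start and i can do better, so try i + 1.
            start = i + 1
            tank = 0
    return start if total >= 0 else -1

If the total surplus is negative, no starting station works; otherwise the last candidate start is the unique answer.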

What is the difference in time complexity between these two blocks of code (if any) and why?

Trying to solidify my knowledge about Time Complexity. I think I know the answer to this, but would like to hear some good explanations.
main = []
while len(main) < 5:
    sub = []
    while len(sub) < 5:
        sub.append(random.randint(1,10))
    main.append(sub)
VS
main = []
sub = []
while len(main) < 5:
    sub.append(random.randint(1,10))
    if len(sub) == 5:
        main.append(list(sub))
        sub = []
There's no difference, since the time complexity is constant in both cases - you perform a constant amount of operations both times.
The time complexity of both is O(1), constant time, because they both perform a constant number of operations, as Yakov Dan already stated.
This is because time complexity is usually expressed as a function of a variable number (say n) and tends to show how changing the value of n will change the time the algorithm takes.
Now, assuming you had n instead of 5, you would have O(n^2) for both cases. It may be tricky for the second case, since a basic way of estimating polynomial complexity is to count the number of nested loops, which could lead you to conclude that the second version is O(n) because it has a single loop.
However, looking at it carefully shows that the loop runs n (5 in this case) times for sub for each value appended to main, so it is essentially the same.
This of course assumes that the built-in list.append runs in constant time.
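One way to check this (a sketch of my own, not from the question) is to replace 5 with a parameter n and count the append calls to sub: both versions perform n*n of them, so they grow identically.

import random

def version_one(n):
    appends = 0
    main = []
    while len(main) < n:
        sub = []
        while len(sub) < n:
            sub.append(random.randint(1, 10))
            appends += 1
        main.append(sub)
    return appends

def version_two(n):
    appends = 0
    main = []
    sub = []
    while len(main) < n:
        sub.append(random.randint(1, 10))
        appends += 1
        if len(sub) == n:
            main.append(list(sub))
            sub = []
    return appends

for n in (5, 10, 20):
    print(n, version_one(n), version_two(n))  # both counts equal n * n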

What is optimal algorithm to check if a given integer is equal to sum of two elements of an int array?

def check_set(S, k):
    S2 = k - S
    set_from_S2 = set(S2.flatten())
    for x in S:
        if x in set_from_S2:
            return True
    return False
I have a given integer k. I want to check if k is equal to the sum of two elements of the array S.
S = np.array([1,2,3,4])
k = 8
It should return False in this case because there are no two elements of S with a sum of 8. The above code treats it as 8 = 4 + 4, so it returned True.
I can't find an algorithm to solve this problem with complexity of O(n).
Can someone help me?
You have to account for multiple instances of the same item, so a set is not a good choice here.
Instead you can use a dictionary whose values count how many times each key occurs (or, as a variant, collections.Counter from the standard library).
A = [3, 1, 2, 3, 4]
Cntr = {}
for x in A:
    if x in Cntr:
        Cntr[x] += 1
    else:
        Cntr[x] = 1

#k = 11
k = 8
ans = False
for x in A:
    if (k - x) in Cntr:
        if k == 2 * x:
            if Cntr[k - x] > 1:
                ans = True
                break
        else:
            ans = True
            break
print(ans)
Returns True for k=5,6 (I added one more 3) and False for k=8,11
Adding onto MBo's answer.
"Optimal" can be an ambiguous term in algorithmics, as there is often a compromise between how fast the algorithm runs and how memory-efficient it is. Sometimes we may also be interested in either worst-case resource consumption or in average resource consumption. We'll look at the worst case here because it's simpler and roughly equivalent to the average in our scenario.
Let's call n the length of our array, and let's consider 3 examples.
Example 1
We start with a very naive algorithm for our problem, with two nested loops that iterate over the array, and check for every two items of different indices if they sum to the target number.
Time complexity: the worst-case scenario (where the answer is False, or where it's True but we only find it on the last pair of items we check) has n^2 loop iterations. If you're familiar with big-O notation, we say the algorithm's time complexity is O(n^2), which basically means that in terms of our input size n, the time it takes to solve the problem grows more or less like n^2 up to a multiplicative factor (well, technically the notation means "at most like n^2 up to a multiplicative factor", but it's a generalized abuse of language to use it as "more or less like" instead).
Space complexity (memory consumption): we only store an array, plus a fixed set of objects whose sizes do not depend on n (everything Python needs to run, the call stack, maybe two iterators and/or some temporary variables). The part of the memory consumption that grows with n is therefore just the size of the array, which is n times the amount of memory required to store an integer in an array (let's call that sizeof(int)).
Conclusion: Time is O(n^2), Memory is n*sizeof(int) (+O(1), that is, up to an additional constant factor, which doesn't matter to us, and which we'll ignore from now on).
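For reference, Example 1 as code: a deliberately naive O(n^2) check. This is my own sketch using a plain Python list rather than the numpy array from the question.

def check_pairs_naive(S, k):
    n = len(S)
    for i in range(n):
        for j in range(n):
            # two items at different indices (the same value may appear twice)
            if i != j and S[i] + S[j] == k:
                return True
    return False

print(check_pairs_naive([1, 2, 3, 4], 8))  # False: no two distinct elements sum to 8
print(check_pairs_naive([1, 2, 3, 4], 7))  # True: 3 + 4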
Example 2
Let's consider the algorithm in MBo's answer.
Time complexity: much, much better than in Example 1. We start by creating a dictionary. This is done in a loop over n. Setting keys in a dictionary is a constant-time operation under proper conditions, so the time taken by each step of that first loop does not depend on n. Therefore, so far we've used O(n) in terms of time complexity. Now we only have one remaining loop over n. The time spent accessing elements of our dictionary is independent of n, so once again, that loop is O(n). Combining the two loops: since they both grow like n up to a multiplicative factor, so does their sum (up to a different multiplicative factor). Total: O(n).
Memory: Basically the same as before, plus a dictionary of n elements. For the sake of simplicity, let's consider that these elements are integers (we could have used booleans), and forget about some of the aspects of dictionaries to only count the size used to store the keys and the values. There are n integer keys and n integer values to store, which uses 2*n*sizeof(int) in terms of memory. Add to that what we had before and we have a total of 3*n*sizeof(int).
Conclusion: Time is O(n), Memory is 3*n*sizeof(int). The algorithm is considerably faster when n grows, but uses three times more memory than example 1. In some weird scenarios where almost no memory is available (embedded systems maybe), this 3*n*sizeof(int) might simply be too much, and you might not be able to use this algorithm (admittedly, it's probably never going to be a real issue).
Example 3
Can we find a trade-off between Example 1 and Example 2?
One way to do that is to replicate the same kind of nested loop structure as in Example 1, but with some pre-processing to replace the inner loop with something faster. To do that, we sort the initial array, in place. Done with well-chosen algorithms, this has a time-complexity of O(n*log(n)) and negligible memory usage.
Once we have sorted our array, we write our outer loop (which is a regular loop over the whole array), and then inside that outer loop, use dichotomy (binary search) to look for the number we're missing to reach our target k. This dichotomy search has a memory consumption of O(log(n)) and a time complexity of O(log(n)) as well.
Time complexity: The pre-processing sort is O(n*log(n)). Then in the main part of the algorithm, we have n calls to our O(log(n)) dichotomy search, which totals to O(n*log(n)). So, overall, O(n*log(n)).
Memory: Ignoring the constant parts, we have the memory for our array (n*sizeof(int)) plus the memory for our call stack in the dichotomy search (O(log(n))). Total: n*sizeof(int) + O(log(n)).
Conclusion: Time is O(n*log(n)), Memory is n*sizeof(int) + O(log(n)). Memory is almost as small as in Example 1. Time complexity is slightly higher than in Example 2. In scenarios where Example 2 cannot be used because we lack memory, the next best thing in terms of speed would realistically be Example 3, which is almost as fast as Example 2 and probably has enough room to run where the very slow Example 1 does.
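A sketch of Example 3 in code (my own illustration, not part of the original answer): sort in place, then binary-search the complement of each element with Python's bisect module, taking care not to match an element with itself unless the value occurs twice.

import bisect

def check_pairs_sorted(S, k):
    S.sort()  # O(n*log(n)) pre-processing, in place
    n = len(S)
    for i, x in enumerate(S):
        target = k - x
        j = bisect.bisect_left(S, target)  # O(log(n)) binary search
        if j < n and S[j] == target and j != i:
            return True  # found a distinct element with the right value
        if j == i and j + 1 < n and S[j + 1] == target:
            return True  # the value occurs at least twice
    return False

print(check_pairs_sorted([1, 2, 3, 4], 8))  # False
print(check_pairs_sorted([1, 2, 3, 4], 5))  # True: 1 + 4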
Overall conclusion
This answer was just to show that "optimal" is context-dependent in algorithmics. It's very unlikely that in this particular example, one would choose to implement Example 3. In general, you'd see either Example 1 if n is so small that one would choose whatever is simplest to design and fastest to code, or Example 2 if n is a bit larger and we want speed. But if you look at the wikipedia page I linked for sorting algorithms, you'll see that none of them is best at everything. They all have scenarios where they could be replaced with something better.

O(N) Time complexity for simple Python function

I just took a Codility demo test. The question and my answer can be seen here, but I'll paste my answer here as well. My response:
def solution(A):
    # write your code in Python 2.7
    retresult = 1  # the smallest integer we can return, if it is not in the array
    A.sort()
    for i in A:
        if i > 0:
            if i == retresult: retresult += 1  # increment the result since the current result exists in the array
            elif i > retresult: break  # we can go out of the loop since we found a bigger number than our current positive integer result
    return retresult
My question is about time complexity, which I hope to understand better from your responses. The question says the expected worst-case time complexity is O(N).
Does my function have O(N) time complexity? Does the fact that I sort the array increase the complexity, and if so how?
Codility reports (for my answer)
Detected time complexity:
O(N) or O(N * log(N))
So, what is the complexity for my function? And if it is O(N*log(N)), what can I do to decrease the complexity to O(N) as the problem states?
Thanks very much!
p.s. my background reading on time complexity comes from this great post.
EDIT
Following the reply below, and the answers described here for this problem, I would like to expand on this with my take on the solutions:
basicSolution has an expensive time complexity and so is not the right answer for this Codility test:
def basicSolution(A):
    # O(N*log(N)) time complexity
    retresult = 1  # the smallest integer we can return, if it is not in the array
    A.sort()
    for i in A:
        if i > 0:
            if i == retresult: retresult += 1  # increment the result since the current result exists in the array
            elif i > retresult: break  # we can go out of the loop since we found a bigger number than our current positive integer result
        else:
            continue  # negative numbers and 0 don't need any work
    return retresult
hashSolution is my take on what is described in the above article, in the "use hashing" paragraph. As I am new to Python, please let me know if you have any improvements to this code (it does work against my test cases), and what time complexity it has.
def hashSolution(A):
    # O(N) time complexity, I think? but requires O(N) extra space (the requirement states to use O(N) space)
    table = {}
    for i in A:
        if i > 0:
            table[i] = True  # collision/duplicate will just overwrite
    for i in range(1, 100000 + 1):  # the problem says that the array has a maximum of 100,000 integers
        if not table.get(i): return i
    return 1  # default
Finally, the actual O(N) solution (O(N) time and O(1) extra space) I am having trouble understanding. I understand that negative/0 values are pushed to the back of the array, and then we have an array of just positive values. But I do not understand the findMissingPositive function - could anyone please describe it with Python code/comments? With an example perhaps? I've been trying to work through it in Python and just cannot figure it out :(
It does not, because you sort A.
The Python list.sort() function uses Timsort (named after Tim Peters), and has a worst-case time complexity of O(NlogN).
Rather than sort your input, you'll have to iterate over it and determine if any integers are missing by some other means. I'd use a set of a range() object:
def solution(A):
    expected = set(range(1, len(A) + 1))
    for i in A:
        expected.discard(i)
    if not expected:
        # all consecutive digits for len(A) were present, so next is missing
        return len(A) + 1
    return min(expected)
This is O(N); we create a set of len(A) (O(N) time), then we loop over A, removing elements from expected (again O(N) time, removing elements from a set is O(1)), then test for expected being empty (O(1) time), and finally get the smallest element in expected (at most O(N) time).
So we make at most 3 O(N) time steps in the above function, making it an O(N) solution.
This also fits the storage requirement; all we use is a set of size N. Sets have a small per-element overhead, but the storage remains proportional to N.
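For example (my own quick check of the solution above, not from the original answer):

print(solution([1, 3, 6, 4, 1, 2]))  # 5: the smallest positive integer missing from A
print(solution([1, 2, 3]))           # 4: all of 1..len(A) are present
print(solution([-1, -3]))            # 1: there are no positive values at all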
The hash solution you found is based on the same principle, except that it uses a dictionary instead of a set. Note that the dictionary values are never actually used; they are either set to True or absent. I'd rewrite that as:
def hashSolution(A):
    seen = {i for i in A if i > 0}
    if not seen:
        # there were no positive values, so 1 is the first missing.
        return 1
    for i in range(1, 10**5 + 1):
        if i not in seen:
            return i
    # we can never get here because the input is limited to at most 100,000
    # integers, so either `seen` has a limited number of positive values
    # below 100,000 or none at all.
The above avoids looping all the way to 100,000 if there were no positive integers in A.
The difference between mine and theirs is that mine starts with the set of expected numbers, while they start with the set of positive values from A, inverting the storage and test.

Decreasing while loop working time for gigantic numbers

i = 1
d = 1
e = 20199
a = 10242248284272
while True:
    print(i)
    if (i * e) % a == 1:
        d = i
        break
    i = i + 1
The numbers above are placeholders that only indicate the lengths of the actual values.
I have a piece of Python code which works on really huge numbers, and I have to wait a day, maybe two, for it to finish. How can I decrease the time needed for the code to run?
There are a variety of things you can do to speed this up; whether it will be enough is another matter.
Avoid printing things you don't have to.
Avoid multiplication. Instead of computing i*e each iteration, you could have a new variable, say i_e, which holds this value. Each time you increment i, you can add e to it.
The vast majority of your iterations aren't even close to satisfying (i*e)%a==1. If you used a larger increment for i, you could avoid testing most of them. Of course, you have to be careful not to overshoot.
Because you are doing modular arithmetic, you don't need to use the full value of i in your test (though you will still need to track it for the output).
Combining the previous two points, you can avoid computing the modulus: If you know that a<=x<2*a, then x-a==1 is equivalent to x%a==1.
None of these may be enough to reduce the complexity of the problem, but they might shrink the constant factor enough to speed things up for your purposes.
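A minimal sketch of the incremental version described in the points above (my own illustration; it assumes e < a, as in the question, and the helper name smallest_inverse is hypothetical). The number of iterations is unchanged; only the per-iteration cost shrinks, since there is no multiplication or modulus inside the loop.

def smallest_inverse(e, a):
    i = 1
    running = e % a  # equals (i * e) % a at all times
    while running != 1:
        i += 1
        running += e      # add e instead of recomputing i * e
        if running >= a:  # since e < a, one subtraction reduces it mod a
            running -= a
    return i

print(smallest_inverse(7, 31))  # 9, because (9 * 7) % 31 == 1

Like the original loop, this never terminates if no such i exists (for example, when e and a share a common factor).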
