Reasoning behind Quadratic Complexity in this particular code - python

I am asking in reference to this code:
array = [-37,-36,-19,-99,29,20,3,-7,-64,84,36,62,26,-76,55,-24,84,49,-65,41]
print sum(i for i in array if array.index(i) % 2 == 0)*array[-1] if array != [] else 0
You could see it here: Python code for sum with condition
Is this Quadratic because of a for loop followed by an if statement, inside the brackets?
Is this code, proposed by one more person on the same page - array[-1] * sum(array[::2]) free of Quadratic behaviour?
I think, it's again a Quadratic one, as it has to do a traversal and that too an alternate one.
Thanks in advance.

Yes, it's the array.index that makes it quadratic.
Let's first cut all the irrelevant stuff away. The conditional is irrelevant to the complexity analysis (the check array != [] takes O(1) time), and so is the multiplication by array[-1]. So you're left with:
sum(i for i in array if array.index(i) % 2 == 0)
Now the inner part is a generator expression. It expands to an anonymous generator function that loops through array and yields a bunch of values, at most one per iteration. The sum function receives these values and adds them up.
The confusing thing may be how a generator actually works. It works by running the generator code interleaved with code from the consumer. As a result, the total complexity is the sum of the complexity of the generator and that of the consumer (i.e. sum). Now sum has linear complexity (it should have; it would if I wrote it).
The generator loops through the array, but for each element it calls array.index, which is itself of O(N) complexity. That gives O(N^2) overall.
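As a rough sketch of the expansion described above, the generator expression behaves much like the generator function below. The name even_index_values is made up for illustration; Python does not give the anonymous generator a name.

```python
def even_index_values(array):
    # Outer loop: one step per element, so N steps.
    for i in array:
        # array.index(i) scans from the start of the list: up to N more steps.
        if array.index(i) % 2 == 0:
            yield i

# sum() pulls values from the generator one at a time.
total = sum(even_index_values([0, 1, 2, 2]))
print(total)
```

The nested scan inside the loop is where the quadratic cost comes from; sum itself only adds the values it is handed.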
To fix this you can use enumerate to avoid calling array.index(i). It may or may not be what you want, though, since array.index(i) returns the first index at which the element equals i, which might not be the index where you actually found i:
sum(i for idx, i in enumerate(array) if idx % 2 == 0)
To see the difference, consider the list array = [0, 1, 2, 2]. The first solution sums this to 4, since array.index(2) == 2, so it also adds the second 2. The latter solution, however, adds up to 2, since enumerate enumerates the elements of array, yielding the pairs (0,0), (1,1), (2,2), (3,2), where the first component is the actual index and the second is the actual element. Here the second 2 is omitted because it actually comes from index 3.
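The two variants can be put side by side to check the numbers in this example. The function names sum_by_index and sum_by_enumerate are only illustrative.

```python
def sum_by_index(array):
    # Quadratic: array.index(i) rescans the list for every element.
    return sum(i for i in array if array.index(i) % 2 == 0)

def sum_by_enumerate(array):
    # Linear: enumerate supplies the real position of each element.
    return sum(i for idx, i in enumerate(array) if idx % 2 == 0)

array = [0, 1, 2, 2]
print(sum_by_index(array))      # 4
print(sum_by_enumerate(array))  # 2
```

The results only differ when the list contains duplicates; with distinct elements both functions return the same sum.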

The first solution is indeed quadratic: each call to array.index iterates over array again, so it behaves like a nested loop.
The second solution traverses the list only once, skipping the odd indexes.
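A minimal sketch of the slice-based version, wrapped in a hypothetical helper scaled_even_sum that keeps the empty-list guard from the question. Note that array[::2] builds a new list holding every other element, so it takes O(N) time and some extra memory, but it never rescans the list.

```python
def scaled_even_sum(array):
    # array[::2] copies the elements at indexes 0, 2, 4, ... in one pass.
    return array[-1] * sum(array[::2]) if array else 0

print(scaled_even_sum([1, 2, 3, 4]))  # (1 + 3) * 4 = 16
print(scaled_even_sum([]))            # 0
```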

Related

double list in return statement. need explanation in python

So I was trying to complete this kata on code wars and I ran across an interesting solution. The kata states:
"Given an array of integers, find the one that appears an odd number of times.
There will always be only one integer that appears an odd number of times."
and one of the solutions for it was:
def find_it(seq):
    return [x for x in seq if seq.count(x) % 2][0]
My question is why is there [0] at the end of the statement. I tried playing around with it and putting [1] instead and when testing, it passed some tests but not others with no obvious pattern.
Any explanation will be greatly appreciated.
The first brackets are a list comprehension, the second is indexing the resulting list. It's equivalent to:
def find_it(seq):
    thelist = [x for x in seq if seq.count(x) % 2]
    return thelist[0]
The code is actually pretty inefficient, because it builds the whole list just to get the first value that passed the test. It could be implemented much more efficiently with next + a generator expression (like a listcomp, but lazy, with the values produced exactly once, and only on demand):
def find_it(seq):
    return next(x for x in seq if seq.count(x) % 2)
which would behave the same, with the only difference being that the exception raised if no values passed the test would be IndexError in the original code, and StopIteration in the new code, and it would operate more efficiently by stopping the search the instant a value passed the test.
Really, you should just give up on using the .count method and count all the elements in a single pass, which is truly O(n) (count solutions can't be, because count itself is O(n) and must be called a number of times roughly proportionate to the input size; even if you dedupe it, in the worst case scenario all elements appear twice and you have to call count n / 2 times):
from collections import Counter
def find_it(it):
    # Counter(it) counts all items of any iterable, not just sequence,
    # in a single pass, and since 3.6, it's insertion order preserving,
    # so you can just iterate the items of the result and find the first
    # hit cheaply
    return next(x for x, cnt in Counter(it).items() if cnt % 2)
That list comprehension yields a sequence of values that occur an odd number of times. The first value of that sequence will occur an odd number of times. Therefore, getting the first value of that sequence (via [0]) gets you a value that occurs an odd number of times.
Happy coding!
The code [x for x in seq if seq.count(x) % 2] returns a list of the values that appear an odd number of times in the input list.
So, to return a number rather than a list, the author takes index 0, which returns the first element of that list.
There is already a nice answer here by ShadowRanger, so I won't duplicate it; I'll only add another phrasing of part of it.
The expression [some_content][0] is not a double list. It is a way to get an element out of a list by indexing. So the second "list" is the syntax for choosing an element of a list by its index (i.e. its position in the list, which in Python begins at zero and not, as is sometimes intuitively expected, at one). So [0] addresses the first element of the list to the left of [0].
['this', 'is', 'a', 'list'][0] <-- this is the index of 'this' in the list
print( ['this', 'is', 'a', 'list'][0] )
will print
this
to the stdout.
The intention of the function in your question is to return a single value, not a list.
So to get the single value out of the list built by the list comprehension, the index [0] is used. Indexing takes the value result out of the list [result], since
[result][0] == result.
The same function could be also written using a loop as follows:
def find_it(seq):
    for x in seq:
        if seq.count(x) % 2 != 0:
            return x
but using a list comprehension instead of a loop is usually faster in Python. That is the reason why it sometimes makes sense to use a list comprehension and then unpack the found value(s) out of the list. It will in most cases be faster than an equivalent loop, but not in this special case, where it slows things down, as already mentioned by ShadowRanger.
The list comprehension keeps one entry per occurrence, so a value that occurs three times appears three times in the resulting list. This explains why the index [1] sometimes works: it succeeds whenever the odd-count value occurs at least three times, and raises IndexError when that value occurs only once.
What you see in the function in your question is a failed attempt to make it faster by using a list comprehension instead of a loop. An actual improvement can be achieved, but by using a generator expression and another way of counting, as shown in the answer by ShadowRanger:
from collections import Counter
def find_it(it):
    return next(x for x, cnt in Counter(it).items() if cnt % 2)
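To see why replacing [0] with [1] passes some tests and fails others, it can help to look at the intermediate list itself. This is a small sketch with made-up inputs; odd_count_entries is a hypothetical helper that returns the list before any indexing.

```python
def odd_count_entries(seq):
    # One entry per occurrence of every value whose count is odd.
    return [x for x in seq if seq.count(x) % 2]

three_times = odd_count_entries([1, 1, 2, 2, 2])  # 2 occurs three times
once = odd_count_entries([1, 1, 2])               # 2 occurs once

print(three_times)  # [2, 2, 2] -> indexes 0, 1 and 2 all work
print(once)         # [2]       -> only index 0 works, [1] raises IndexError
```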

Explain the bubble sort algorithm?

I'm trying to learn more about algorithms and I'm looking into the bubble sort algorithm. I found the script for it on GitHub but I can't really understand it. I'm sorta new to Python, so can someone explain to me what's going on in this script?
from __future__ import print_function
def bubble_sort(arr):
    n = len(arr)
    # Traverse through all array elements
    for i in range(n):
        # Last i elements are already in place
        for j in range(0, n-i-1):
            # traverse the array from 0 to n-i-1
            # Swap if the element found is greater
            # than the next element
            if arr[j] > arr[j+1] :
                arr[j], arr[j+1] = arr[j+1], arr[j]
    return arr
if __name__ == '__main__':
    try:
        raw_input  # Python 2
    except NameError:
        raw_input = input  # Python 3
    user_input = raw_input('Enter numbers separated by a comma:').strip()
    unsorted = [int(item) for item in user_input.split(',')]
    print(*bubble_sort(unsorted), sep=',')
Visualize the array as a vertical list of numbers, with the first element (index 0) on the bottom, and the last element (index n-1) at the top. The idea of bubble sort is that numbers "bubble up" to the top, into the place where they belong.
For example, [2,3,1] would first look at 2 and 3, and not do anything because they're already in order. Then it would look at 3 and 1, swapping them since 3>1 and getting [2,1,3]. Then we repeat by looking at 2 and 1, swapping them since 2>1 to get [1,2,3], which is in order.
The idea is that "3" and then "2" bubbled up to the correct position.
Note that after the 3 bubbled up, we don't have to compare 2 and 3, because we know the last element is already higher than everything before it. In general, after i iterations of bubble sort, there's no need to compare the last i elements.
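The walk-through above can be reproduced with a small variant of the function that records the list after each outer pass. This is only a sketch for tracing; bubble_sort_passes is a made-up name and the function works on a copy of its input.

```python
def bubble_sort_passes(arr):
    arr = list(arr)  # work on a copy so the caller's list is untouched
    n = len(arr)
    snapshots = []
    for i in range(n):
        # The last i elements are already in place, so stop before them.
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
        snapshots.append(list(arr))  # state after pass i
    return snapshots

for state in bubble_sort_passes([2, 3, 1]):
    print(state)
# [2, 1, 3]  after the first pass, 3 has bubbled to the end
# [1, 2, 3]  after the second pass, 2 is in place
# [1, 2, 3]  the third pass has nothing left to compare
```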
from __future__ import print_function This makes print behave as a function, as it does in Python 3, even when the script runs under Python 2, so the same code works on both versions.
def bubble_sort(arr): This is a function definition. A function definition is preceded by the keyword def. Following that is the function's name. In this case it is called bubble_sort. What we have in the parentheses are called parameters. A parameter is something we give to a function so that the function may use it, e.g., multiply the parameter by a number, sort the list, or send some information to a server.
Since we are on the topic of functions, I would suggest looking up process abstraction.
arr Here I am referring to arr within the function's definition. It is short for array. In Python these are actually known as lists, and we could define one like so: fruits = ["banana", "apple", "orange"]. Lists are useful for grouping together like pieces of information. So, conceptually, it may be easier to imagine a list rather than the more esoteric array.
n = len(arr) We are literally assigning the length of the array to the variable n. This is probably shorthand for number of elements. len(arr) is a function call that takes an array/list and returns its length. Similarly, one could call print(len(arr)) or simply len(arr).
for j in range(0, n-i-1): This is a bit more complicated since it requires an understanding of the algorithm in play, i.e., bubblesort. I won't explain how bubblesort works since there is probably a ton of videos online, but I will explain the bit within the parenthesis.
(0, n-i-1) The inner loop compares each element arr[j] with its neighbor arr[j+1]. After i passes of the outer loop, the last i elements are already in their final positions, so there is no need to compare them again; that is why we subtract i from n. We subtract an additional 1 because the comparison looks at arr[j+1], so j has to stop one short of the end of the unsorted part.
if arr[j] > arr[j+1] : This is a conditional statement, also known as a branching statement, or an if-statement. The condition, arr[j] > arr[j+1], is true when the element at position j is greater than the one at j+1.
arr[j], arr[j+1] = arr[j+1], arr[j] This is Python's shorthand for swapping two values (tuple packing and unpacking). An equivalent swap with a temporary variable is shown below.
temp = arr[j]
arr[j] = arr[j+1]
arr[j+1] = temp
return arr Returns the sorted array.
The last bit I am not familiar with since I don't use python much. Perhaps this could be some research for you.
Hopefully this helps.

What is the time complexity in my single number code

The question is "Given an array of integers, every element appears twice except for one. Find that single one.
Note:
Your algorithm should have a linear runtime complexity. Could you implement it without using extra memory?"
My code is below:
def singleNumber(nums):
    for i in range(len(nums)):
        if nums.count(nums[i]) == 1:
            return nums[i]
Why is my code not O(N)? Isn't the complexity determined by the for loop, which takes n rounds?
Thanks.
We can do this using an xor command:
a = [0,0,1,1,2,2,3,3,4,5,5,6,6,7,7,8,8,9,9]
ans = 0
for i in a:
ans = i^ans
ans
4
This works, as the xor is effectively doing (1^1)^(2^2)^4^(5^5), and we cancel all the doubles out.
Your code runs in O(N^2) because the count method runs in O(N) and it is executed inside a for loop, which gives O(N^2) in total.
If you want to make it run in O(N), you can do the following:
Loop through the array and store each value as a key in a dict, with the number of times it appears in the list as that key's value.
Then iterate through the key-value pairs of the dict and look for the key whose value is 1. This takes O(2*N), which is O(N) time complexity, at the cost of O(N) extra memory for the dict.
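The two steps described above could look roughly like this. It is a sketch, not the only way to write it; the name single_number_counts is made up, and the function returns None if no value occurs exactly once.

```python
def single_number_counts(nums):
    counts = {}
    # First pass: count how many times each value appears.
    for value in nums:
        counts[value] = counts.get(value, 0) + 1
    # Second pass: find the value that was seen exactly once.
    for value, count in counts.items():
        if count == 1:
            return value
    return None

print(single_number_counts([0, 0, 1, 1, 2, 2, 3, 3, 4, 5, 5]))  # 4
```

Unlike the xor approach, this uses extra memory proportional to the number of distinct values, so it does not meet the "without extra memory" part of the challenge.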

Python Non-recursive Permutation

Does anyone understand the following iterative algorithm for producing all permutations of a list of numbers?
I do not understand the logic within the while len(stack) loop. Can someone please explain how it works?
# Non-Recursion
#param nums: A list of Integers.
#return: A list of permutations.
def permute(self, nums):
    if nums is None:
        return []
    nums = sorted(nums)
    permutation = []
    stack = [-1]
    permutations = []
    while len(stack):
        index = stack.pop()
        index += 1
        while index < len(nums):
            if nums[index] not in permutation:
                break
            index += 1
        else:
            if len(permutation):
                permutation.pop()
            continue
        stack.append(index)
        stack.append(-1)
        permutation.append(nums[index])
        if len(permutation) == len(nums):
            permutations.append(list(permutation))
    return permutations
I'm just trying to understand the code above.
As mentioned in the comments section to your question, debugging may provide a helpful way to understand what the code does. However, let me provide a high-level perspective of what your code does.
First of all, although there are no recursive calls to the function permute, the code you provided is effectively recursive: all it does is keep its own stack instead of using the call stack. Specifically, the variable stack keeps the recursive state, so to speak, that is passed from one recursive call to another. You could, and perhaps should, consider each iteration of the outer while loop in the permute function as a recursive call. If you do so, you will see that the outer while loop helps 'recursively' traverse each permutation of nums in a depth-first manner.
Noticing this, it's fairly easy to figure out what each 'recursive call' does. Basically, the variable permutation keeps the current permutation of nums that is being formed as the while loop progresses. The variable permutations stores all the permutations of nums that have been found. As you may observe, permutations is updated only when len(permutation) is equal to len(nums), which can be considered the base case of the recurrence relation that is being implemented using a custom stack. Finally, the inner while loop picks which element of nums to add to the current permutation (i.e. the one stored in the variable permutation) being formed.
So that is about it, really. You can figure out what exactly is being done on the lines that maintain stack using a debugger, as suggested. As a final note, let me repeat that I, personally, would not consider this implementation to be non-recursive. It just so happens that, instead of using the call stack provided by the runtime, this recursive solution keeps its own stack. To give a better idea of what a proper non-recursive solution looks like, you may observe the difference between the recursive and iterative solutions to the problem of finding the nth Fibonacci number provided below. As you can see, the non-recursive solution keeps no stack, and instead of dividing the problem into smaller instances of it (recursion), it builds up the solution from smaller solutions (dynamic programming).
def recursive_fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return recursive_fib(n-1) + recursive_fib(n-2)
def iterative_fib(n):
    if n == 0:
        return 0
    f_0 = 0
    f_1 = 1
    # After the iteration for i, f_1 holds the i-th Fibonacci number
    for i in range(2, n + 1):
        f_2 = f_1 + f_0
        f_0 = f_1
        f_1 = f_2
    return f_1
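For comparison with the explicit-stack permute in the question, here is a sketch of the plain recursive depth-first search that it imitates. The name permute_recursive and the inner helper extend are made up for illustration, and like the original it assumes the elements of nums are distinct. One difference: for an empty list this sketch returns [[]], whereas the stack version returns [].

```python
def permute_recursive(nums):
    nums = sorted(nums)
    permutations = []
    permutation = []  # the partial permutation, like in the stack version

    def extend():
        # Base case: every element has been placed.
        if len(permutation) == len(nums):
            permutations.append(list(permutation))
            return
        # Try each unused element in order (the inner while loop's job).
        for value in nums:
            if value not in permutation:
                permutation.append(value)  # like pushing onto the stack
                extend()                   # go one level deeper
                permutation.pop()          # backtrack, like the else branch

    extend()
    return permutations

print(permute_recursive([1, 2, 3]))
```

Each call to extend corresponds to one level of the stack in the iterative version, and the position of the for loop at that level plays the role of the index stored on the stack.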
The answer from @ilim is correct and should be the accepted answer, but I just wanted to add another point that wouldn't fit in a comment. Whilst I imagine you are studying this algorithm as an exercise, it should be pointed out that a better way to proceed, depending on the size of the list, may be to use the permutations() function from itertools:
import itertools
print(list(itertools.permutations([1, 2, 3])))
Testing on my machine with a list of 11 items (39m permutations) took 1.7secs with itertools.permutations(x) but took 76secs using the custom solution above. Note however that with 12 items (479m permutations) the itertools solution blows up with a memory error. If you need to generate permutations of such size efficiently you may be better dropping to native code.
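One likely contributor to the memory error is that the whole list of permutations is built at once. A minimal sketch, assuming the permutations only need to be processed one at a time rather than stored: itertools.permutations returns a lazy iterator, so it can be consumed in a loop without keeping every tuple in memory. The helper count_permutations is hypothetical and just counts them.

```python
import itertools

def count_permutations(items):
    # Each permutation is produced, counted, and discarded in turn.
    return sum(1 for _ in itertools.permutations(items))

print(count_permutations([1, 2, 3, 4]))  # 24
```

This keeps memory use small, but the running time still grows with the number of permutations.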

Summation Combination code

Hi I am new to Python and am trying to solve a coding challenge where the objective is to create a function which takes an array of integers (both negative and positive) and a target value (also an integer) and then outputs true if any combinations of the numbers sum up to the target value, which will always be larger than any other single element in the input array. I tried to do this but horribly failed. I found someone else's code who did it in 2 lines, but I have no idea how it works and was hoping someone here could point me in the right direction. Here is the code
def subsetsum(target, arr):
    if len(arr) == 0:
        return target == 0
    return subsetsum(target, arr[1:]) or subsetsum(target - arr[0], arr[1:])
I mean I don't even see where they sum up the numbers and compare to the target, any help will be greatly appreciated. Thanks in advance.
Their solution uses recursion to express all possible combinations of the numbers in a pretty clever way. But clever does not always mean readable or easy to understand (which all code should be) so don't feel bad for not getting it right away. Personally, I would not consider this good code.
The simplest way to think of it is to work backwards.
What happens if arr has 0 elements? Well, it's explicitly checked at the beginning of the function: it can only be a solution to the problem if target is already 0. Let's refer to this special case of subsetsum (where arr is []) as is_zero, because that's all it does.
What happens if arr has 1 element [x]? For a solution to exist, you have to get target by either including or not including x in the sum. This is done by converting the problem into the 0-element version, as arr[1:] will be the empty list ([]). The first part of the or expression, subsetsum(target, arr[1:]), ignores x and basically asks if is_zero(target). The second part of the expression includes x in the sum by asking if is_zero(target - x). Try it with arr = [9]. You get True if is_zero(target) OR if is_zero(target - 9). Which makes sense, right? If the target is 0 or 9, then [] or [9] are solutions to the problem, respectively.
What happens if arr has 2 elements [x, y]? Again, the problem is reduced into the 1-element-smaller version, as arr[1:] is just [y]. So, is there a solution in subsetsum(target, [y]) OR is there a solution in subsetsum(target - x, [y]). Try it with [3, 9]. The first part of the expression ignores 3 and evaluates True if target is 9 or 0. The second part includes 3 and evaluates True if target - 3 is 9 or 0 (i.e. if target is 12 or 3). Thus, it "tries" all possible combinations: [], [9], [3], and [9, 3].
subsetsum(target, [x, y, z]) is reduced to subsetsum(target, [y, z]) OR subsetsum(target - x, [y, z]), and so on...
Looking from the top down now, what happens is that for every element in arr, the question is split into two subquestions of "is there a solution if we do not include this element?" and "is there a solution if we do include this element?" That way, you end up trying all 2**len(arr) combinations.
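The same include/exclude split can be made visible by collecting every sum the recursion is able to reach instead of testing a single target. This is an illustrative sketch; reachable_sums is a made-up helper, not part of the original solution.

```python
def reachable_sums(arr):
    # With no elements left, the only reachable sum is 0 (the empty subset).
    if not arr:
        return [0]
    rest = reachable_sums(arr[1:])
    # Either skip arr[0] (keep the sums of the rest) or include it.
    return rest + [arr[0] + s for s in rest]

print(reachable_sums([3, 9]))  # [0, 9, 3, 12]
```

The list has 2**len(arr) entries, one per subset, which matches the targets 0, 9, 3 and 12 worked out by hand for [3, 9].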
A much cleaner (and more efficient) solution to this problem is definitely possible using Python's really sweet module itertools, but I don't have time to write it at the moment. I may try it tomorrow just for fun, though.
Edit: Here's my more readable (and probably faster) version.
import itertools
def subsetsum(target, arr):
    for subset_len in range(len(arr) + 1):
        if any(sum(subset) == target for subset in itertools.combinations(arr, subset_len)):
            return True
    return False
It simply finds all subsets, starting with length 0 (so []) and working up to length len(arr), and checks if any of them sum to target. Python is a beautiful language!
edit There's a bug in that code (depending on whether you think there is a combination of the numbers in [1] that sum up to 0). Try
subsetsum(0,[1])
> True
I explain a bit more below.
end edit
So I think the challenge in understanding the code isn't so much about Python as about understanding a recursive function.
Let's assume that the function subsetsum works for lists of length less than N. Now let's take some list of length N. If there is a subset that sums to the target it either involves the first element (arr[0]) or it doesn't. So we'll break this into two problems - checking if there is a subset that has the first element, and checking if there is a subset that doesn't.
Note that arr[1:] is an array of length N-1 starting with the second element of arr.
So if there is a subset that does not involve arr[0], then subsetsum(target,arr[1:]) will give True. If there is a subset that involves arr[0] then subsetsum(target-arr[0],arr[1:]) will return True. If neither holds then it's going to return False.
So basically whether this works for N depends on whether it works for N-1, which depends on whether it works for N-2, etc. Finally we get down to length 0. If the target is 0 at this point, it should return True. If not, then False, or at least that's what the person writing this code thought. I disagree. Explanation of the bug below.
Personally, I would argue that there is a bug in this code: if I give it any list and a target of 0, it will return True, because the empty subset sums to 0. Instead, the test should be whether the first element equals the target, and if it gets down to a length 0 list, it has failed.
def subsetsum(target, arr):
    if len(arr)==0:
        return False
    elif arr[0]==target:
        return True
    else:
        return subsetsum(target, arr[1:]) or subsetsum(target - arr[0], arr[1:])
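To compare the two behaviours directly, both versions can be run on the same inputs. The names subsetsum_original and subsetsum_nonempty are only used here to tell them apart; whether the empty subset should count is a matter of how the challenge is read.

```python
def subsetsum_original(target, arr):
    # Accepts the empty subset, so a target of 0 always succeeds.
    if len(arr) == 0:
        return target == 0
    return (subsetsum_original(target, arr[1:])
            or subsetsum_original(target - arr[0], arr[1:]))

def subsetsum_nonempty(target, arr):
    # Only succeeds once some element has actually been used.
    if len(arr) == 0:
        return False
    elif arr[0] == target:
        return True
    else:
        return (subsetsum_nonempty(target, arr[1:])
                or subsetsum_nonempty(target - arr[0], arr[1:]))

print(subsetsum_original(0, [1]))       # True
print(subsetsum_nonempty(0, [1]))       # False
print(subsetsum_nonempty(12, [3, 9]))   # True
print(subsetsum_nonempty(0, [1, -1]))   # True, via 1 + (-1)
```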
