Majority element using divide&conquer - python

I want to find the majority element of a list using a divide & conquer algorithm.
I found this solution on LeetCode:
class Solution:
    def majorityElement(self, nums, lo=0, hi=None):
        def majority_element_rec(lo, hi):
            # base case; the only element in an array of size 1 is the majority
            # element.
            if lo == hi:
                return nums[lo]
            # recurse on left and right halves of this slice.
            mid = (hi-lo)//2 + lo
            left = majority_element_rec(lo, mid)
            right = majority_element_rec(mid+1, hi)
            # if the two halves agree on the majority element, return it.
            if left == right:
                return left
            # otherwise, count each element and return the "winner".
            left_count = sum(1 for i in range(lo, hi+1) if nums[i] == left)
            right_count = sum(1 for i in range(lo, hi+1) if nums[i] == right)
            return left if left_count > right_count else right
        return majority_element_rec(0, len(nums)-1)
When there is a majority element, the result is right, but when there is no such element, it returns a wrong result.
I tried changing the return statement to:
if left_count > right_count:
    return left
elif left_count < right_count:
    return right
else:
    return -1
so that it returns -1 when there is no right answer.
When the input is [1,2,1,3] the result is -1 (the right answer), but when the input is [1,2,3,3] the output is 3, which is wrong.
It seems that everyone uses this solution, but it isn't working. Any ideas about how to fix it?
TIA

I think the recursive step is OK since if there is a majority element, it must be the majority element of at least one of the halves of the array, and the recursive step will find it. If there is no majority element the recursive algorithm will still return a (wrong) candidate.
If you want the program to detect if the element is indeed the majority element,
I would just change the final step.
def majorityElement(self, nums):
    ...
    candidate = majority_element_rec(0, len(nums)-1)
    if nums.count(candidate) > len(nums)//2:
        return candidate
    else:
        return -1
Alternatively the recursive step could perform the same check whether the found candidate is indeed a majority element by replacing the last line:
return left if left_count > right_count else right
with
majority = ((hi - lo) + 1) / 2
if left_count > majority:
    return left
if right_count > majority:
    return right
return -1
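This should still work because the majority element, if it exists, must also be the majority element of at least one half at every level of the recursion, and thus still propagates to the top. Note that this uses -1 as a sentinel, so it assumes -1 never occurs in the input.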
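Putting the first suggestion together (recursive candidate search, then one verification pass over the whole list) could look like the sketch below. The function name `majority_element`, the empty-list guard and the `-1` sentinel are choices made for this example, not part of the LeetCode solution.

```python
def majority_element(nums):
    """Return the majority element of nums, or -1 if there is none."""
    if not nums:
        return -1

    def candidate(lo, hi):
        # A one-element slice is trivially its own majority.
        if lo == hi:
            return nums[lo]
        mid = lo + (hi - lo) // 2
        left = candidate(lo, mid)
        right = candidate(mid + 1, hi)
        if left == right:
            return left
        # The halves disagree: keep whichever is more frequent in nums[lo..hi].
        left_count = sum(1 for i in range(lo, hi + 1) if nums[i] == left)
        right_count = sum(1 for i in range(lo, hi + 1) if nums[i] == right)
        return left if left_count > right_count else right

    cand = candidate(0, len(nums) - 1)
    # The recursion only yields a candidate; verify it against the whole list.
    return cand if nums.count(cand) > len(nums) // 2 else -1


print(majority_element([1, 2, 3, 3]))  # -1
print(majority_element([1, 2, 1, 3]))  # -1
print(majority_element([2, 2, 1, 2]))  # 2
```

The final `count` is a single O(n) pass, so it does not change the overall O(n log n) running time of the recursion.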

Related

Time complexity of finding range of target in sorted array - Is this solution O(N) in the worst case?

I was going through LeetCode problem 34. Find First and Last Position of Element in Sorted Array, which says:
Given an array of integers nums sorted in non-decreasing order, find the starting and ending position of a given target value.
If target is not found in the array, return [-1, -1].
You must write an algorithm with O(log n) runtime complexity.
Since the question requires O(log n) run time, I implemented binary search. But I suspect that, with the extra while loops inside the base condition, I actually get O(n) in the worst case. Is that true?
class Solution(object):
    def searchRange(self, nums, target):
        """
        :type nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        left = 0
        right = len(nums) - 1
        pos = [-1,-1]
        while left <= right:
            middle = (left + right) // 2
            """
            This is pure binary search until we hit the target. Once
            we have hit the target, we expand towards left and right
            until we find the number equal to the target.
            """
            if nums[middle] == target:
                rIndex = middle
                while rIndex + 1 < len(nums) and nums[rIndex + 1] == target:
                    rIndex += 1
                pos[1] = rIndex
                lIndex = middle
                while lIndex - 1 >= 0 and nums[lIndex - 1] == target:
                    lIndex -= 1
                pos[0] = lIndex
                break
            elif target > nums[middle]:
                left = middle + 1
            else:
                right = middle - 1
        return pos
Here is what I think for an example array that looks like:
input = [8,8,8,8,8,8,8] , target = 8
When the base condition nums[middle] == target hits, I need to iterate over the complete array, which makes the run-time complexity O(n), right?
Interestingly, this solution is faster than 95% of the submissions!! But I think there is some issue with LeetCode!!!
Yes, you are right, the loop degrades the worst case time complexity. You rightly identified what happens when the input array has only duplicates of the target value, and no other value.
The solution is to perform two binary searches: one that prefers going to the left side, and one that prefers to go to the right side of the target value.
If the test cases do not thoroughly test this O(n) behaviour, this O(n) solution will not come out as a bad one.
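The two-binary-search idea could be sketched as follows. The names `search_range`, `lower_bound` and `upper_bound` are made up for this example; each helper is an ordinary binary search that keeps going after it sees the target, one leaning left and one leaning right.

```python
def search_range(nums, target):
    """Return [first, last] index of target in sorted nums, or [-1, -1]."""

    def lower_bound(value):
        # First index i with nums[i] >= value (len(nums) if there is none).
        lo, hi = 0, len(nums)
        while lo < hi:
            mid = (lo + hi) // 2
            if nums[mid] < value:
                lo = mid + 1
            else:
                hi = mid
        return lo

    def upper_bound(value):
        # First index i with nums[i] > value (len(nums) if there is none).
        lo, hi = 0, len(nums)
        while lo < hi:
            mid = (lo + hi) // 2
            if nums[mid] <= value:
                lo = mid + 1
            else:
                hi = mid
        return lo

    first = lower_bound(target)
    if first == len(nums) or nums[first] != target:
        return [-1, -1]
    return [first, upper_bound(target) - 1]


print(search_range([8, 8, 8, 8, 8, 8, 8], 8))  # [0, 6]
print(search_range([5, 7, 7, 8, 8, 10], 6))    # [-1, -1]
```

Both helpers halve the range on every iteration, so the all-duplicates input from the question costs O(log n) rather than O(n).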

Ceiling of the element in sorted array

Hi, I am doing DSA problems and found a problem called ceiling of the element in a sorted array. Given a sorted array, if the target element is present in the array, return the target. If the target element is not found, return the smallest element which is greater than the target. I have written the code and tried some test cases, but I need to check whether everything works correctly. This problem is not on LeetCode, where I could run it against many different cases. I need suggestions/feedback on whether the problem is solved correctly and whether it would give correct results in all cases.
class Solution:
    # My approach
    def smallestNumberGreaterThanTarget(self, nums, target):
        start = 0
        end = len(nums)-1
        if target > nums[end]:
            return -1
        while start <= end:
            mid = start + (end-start)//2
            if nums[mid] == target:
                return nums[mid]
            elif nums[mid] < target:
                if nums[mid+1] >= target:
                    return nums[mid+1]
                start = mid + 1
            else:
                end = mid-1
        return nums[start]
IMO, the problem can be solved in a simpler way, with only one test inside the main loop. The figure below shows a partition of the real line, in subsets associated to the values in the array.
First, we notice that for all values above the largest, there is no corresponding element, and we will handle this case separately.
Now we have exactly N subsets left, and we can find the right one by a dichotomic search among these subsets.
if target > nums[len(nums)-1]:
    return None
s, e = 0, len(nums)
while e > s:
    m = e + ((s - e) >> 1)
    if target > nums[m]:
        s = m + 1
    else:
        e = m
return s
We can formally prove the algorithm using the invariant nums[s-1] < target <= nums[e], with the fictional conventions nums[-1] = -∞ and nums[len(nums)] = +∞. In the end, s == e and we have the bracketing nums[s-1] < target <= nums[s]. Note that s is an index; the ceiling value itself is nums[s].
The code errors out with an index out-of-range error for the empty list (though this may not be necessary because you haven't specified the problem constraints).
A simple if guard at the top of the function can fix this:
if not nums:
return -1
Otherwise, it seems fine to me. But if you're still not sure whether or not your algorithm works, you can always do random testing (e.g. create a linear search version of the algorithm and then randomly generate inputs to both algorithms, and then see if there's any difference).
Here's a one-liner that you can test against:
input_list = [0, 1, 2, 3, 4]
target = 0
print(next((num for num in input_list if num >= target), -1))
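The random-testing idea could be sketched like this. `ceiling_binary` is a stand-in written for this example (swap in your own function), `ceiling_linear` wraps the one-liner above, and the value ranges and seed are arbitrary. Values are kept non-negative so the `-1` "not found" result cannot be confused with a real element.

```python
import random


def ceiling_binary(nums, target):
    """Smallest element >= target in sorted nums, or -1 if none exists."""
    if not nums or target > nums[-1]:
        return -1
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if nums[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return nums[lo]


def ceiling_linear(nums, target):
    # Reference implementation: first element that is >= target.
    return next((num for num in nums if num >= target), -1)


def random_check(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        nums = sorted(rng.randint(0, 20) for _ in range(rng.randint(0, 10)))
        target = rng.randint(-2, 22)
        expected = ceiling_linear(nums, target)
        assert ceiling_binary(nums, target) == expected, (nums, target)
    return True


print(random_check())  # True if no mismatch was found
```

A failing assertion reports the offending `(nums, target)` pair, which is usually small enough to debug by hand.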

Biased Binary Sort Return First Index of Insertion

I am trying to make a way of returning the first possible location where I can insert a new term in an increasing list. My general attempt is to use binary sort until the condition arises that the term before is less than my inserting term, and the next term is equal to or greater than my inserting term:
lis = [1,1,2,2,2,4,5,6,7,8,9]
def binary_sort(elem, lis, left, right):
    mid = (left + right)//2
    if (lis[mid-1] < elem and elem <= lis[mid]):
        return mid
    elif (lis[mid] > mid):
        return binary_sort(elem, lis, mid, right)
    else:
        return binary_sort(elem, lis, left, mid)
This is not working... where is the issue with my code or my general strategy?
I would suggest to take a look at the following code.
def binary_insertion_search(elem, items, left, right):
    if elem > items[right]:
        return right + 1
    while left < right:
        mid = (left + right) // 2
        if elem > items[mid]:
            left = mid + 1
        elif elem <= items[mid]:
            right = mid
    return left
I rewrote your function a little bit. Now for the explanation:
I assumed that the index to return is the first position that any content can be placed at, which in turn would move all following items to the right.
Since we can not incorporate indices outside of the range of the list by design, we have to check if the element is larger than the item at the end of the list. We then return the next possible index len(lis).
To avoid recursion altogether, I used a while loop. It runs as long as left is less than right; once left is equal to or greater than right, we have found the index to put the element at.
Your calculation of the mid value was good, so I just kept it.
If our element is greater than the item at mid, the only possible next option is to select the next unchecked position, which is mid + 1.
Now for the interesting part. Unlike in the other case, we set the right boundary to mid rather than mid - 1, so the mid element is not skipped: we check if the element is smaller than or equal to the item at mid, so mid itself may be the leftmost position.
This guarantees that when we find a candidate that is equal to the searched element, we run the search again (with the range reduced from the right) to possibly find a smaller index. This stops when left == right is true, ending the loop.
I hope this answers your question and points out the differences in the code!
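One way to gain confidence in the function is to compare it with `bisect.bisect_left` from the standard library, which also returns the leftmost insertion point. The sketch below repeats the function so that it runs on its own; the test values are arbitrary, and a non-empty list is assumed because `items[right]` would fail on an empty one.

```python
import bisect


def binary_insertion_search(elem, items, left, right):
    # Larger than everything: insert past the end.
    if elem > items[right]:
        return right + 1
    while left < right:
        mid = (left + right) // 2
        if elem > items[mid]:
            left = mid + 1
        else:
            # Keep mid in range: it may be the leftmost valid position.
            right = mid
    return left


lis = [1, 1, 2, 2, 2, 4, 5, 6, 7, 8, 9]
for elem in range(0, 11):
    got = binary_insertion_search(elem, lis, 0, len(lis) - 1)
    assert got == bisect.bisect_left(lis, elem), (elem, got)

print(binary_insertion_search(2, lis, 0, len(lis) - 1))  # 2
print(binary_insertion_search(3, lis, 0, len(lis) - 1))  # 5
```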

Binary search stuck in an infinite loop python

My binary search seems to be stuck in an infinite loop, as it does not output anything for my test data. The second_list is in sorted order, and I'm checking for duplicates from the first list and then adding those duplicates to an empty list. My binary search algorithm runs until there is one element left, which I then check is the same as the item, and I would like to keep this structure. I believe the problem is that the right pointer is not decreasing; however, I can't see why that is.
def detect_duplicates_binary(first_list, second_list):
    duplicates_list = []
    for item in first_list:
        left = 0
        right = len(second_list) - 1
        while left < right:
            midpoint = (left + right) // 2
            if second_list[midpoint] > item:
                right = midpoint - 1
            else:
                left = midpoint
        if (second_list[left] == item):
            duplicates_list.append(item)
    return duplicates_list
Pathology
Here's a possible execution sequence that would result in an infinite loop:
>>> left = 10
>>> right = 11
>>> left < right
True
>>> midpoint = (left + right) // 2
>>> midpoint
10
>>> left = midpoint
>>> left
10
>>> # No change!
Suggestion 1
Factor out just the bisect function and get it tested separately from the "detect duplicates" code, which is a separate algorithm.
Suggestion 2
Use Python's own bisect module -- it is already tested and documented.
Here's a simplified version of its code:
def bisect_right(a, x):
    """Return the index where to insert item x in list a, assuming a is sorted.
    The return value i is such that all e in a[:i] have e <= x, and all e in
    a[i:] have e > x. So if x already appears in the list, a.insert(i, x) will
    insert just after the rightmost x already there.
    """
    lo = 0
    hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid+1
    return lo
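With the standard library doing the search, the duplicate detection could be as short as the sketch below. The name `detect_duplicates_bisect` is made up for this example; `bisect.bisect_left` is used so that the returned index points at the item itself when it is present.

```python
import bisect


def detect_duplicates_bisect(first_list, second_list):
    """Return the items of first_list that also occur in sorted second_list."""
    duplicates = []
    for item in first_list:
        # Leftmost position where item could be inserted into second_list.
        i = bisect.bisect_left(second_list, item)
        if i < len(second_list) and second_list[i] == item:
            duplicates.append(item)
    return duplicates


print(detect_duplicates_bisect([3, 5, 9, 1], [1, 2, 3, 4, 5]))  # [3, 5, 1]
```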
Suggestion 3
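Increase the right index by 1, so that the range is half-open as in bisect_right above (this goes together with the updates hi = mid and lo = mid+1 used there):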
right = len(second_list)
Hope this helps out. Good luck :-)
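If you want to keep the original structure (narrow down to one element, then compare), one possible fix for the stall shown under Pathology is to round the midpoint up, so that `left = midpoint` always advances. This is a minimal sketch assuming `second_list` is sorted; the empty-list guard is an addition that was not in the question.

```python
def detect_duplicates_binary(first_list, second_list):
    """Return the items of first_list that also occur in sorted second_list."""
    duplicates_list = []
    if not second_list:
        return duplicates_list
    for item in first_list:
        left = 0
        right = len(second_list) - 1
        while left < right:
            # Round up: with left=10, right=11 this gives 11, not 10,
            # so either branch below shrinks the range.
            midpoint = (left + right + 1) // 2
            if second_list[midpoint] > item:
                right = midpoint - 1
            else:
                left = midpoint
        if second_list[left] == item:
            duplicates_list.append(item)
    return duplicates_list


print(detect_duplicates_binary([3, 5, 9, 1], [1, 2, 3, 4, 5]))  # [3, 5, 1]
```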

Quicksort implementation in Python

I'm trying to implement quicksort in python. However, my code doesn't properly sort (not quite). For example, on the input array [5,3,4,2,7,6,1], my code outputs [1,2,3,5,4,6,7]. So, the end result interposes the 4 and 5. I admit I am a bit rusty on python as I've been studying ML (and was fairly new to python before that). I'm aware of other python implementations of quicksort, and other similar questions on Stack Overflow about python and quicksort, but I am trying to understand what is wrong with this chunk of code that I wrote myself:
#still broken 'quicksort'
def partition(array):
    pivot = array[0]
    i = 1
    for j in range(i, len(array)):
        if array[j] < array[i]:
            temp = array[i]
            array[i] = array[j]
            array[j] = temp
            i += 1
    array[0] = array[i]
    array[i] = pivot
    return array[0:(i)], pivot, array[(i+1):(len(array))]
def quick_sort(array):
    if len(array) <= 1: #if i change this to if len(array) == 1 i get an index out of bound error
        return array
    low, pivot, high = partition(array)
    #quick_sort(low)
    #quick_sort(high)
    return quick_sort(low) + [pivot] + quick_sort(high)
array = [5,3,4,2,7,6,1]
print(quick_sort(array))
# prints [1,2,3,5,4,6,7]
I'm a little confused about what the algorithm's connection to quicksort is. In quicksort, you typically compare all entries against a pivot, so you get a lower and higher group; the quick_sort function clearly expects your partition function to do this.
However, in the partition function, you never compare anything against the value you name pivot. All comparisons are between index i and j, where j is incremented by the for loop and i is incremented if an item was found out of order. Those comparisons include checking an item against itself. That algorithm is more like a selection sort with a complexity slightly worse than a bubble sort. So you get items bubbling left as long as there are enough items to the left of them, with the first item finally dumped after where the last moved item went; since it was never compared against anything, we know this must be out of order if there are items left of it, simply because it replaced an item that was in order.
Thinking a little more about it, the items are only partially ordered, since you do not return to an item once it has been swapped to the left, and it was only checked against the item it replaced (now found to have been out of order). I think it is easier to write the intended function without index wrangling:
def partition(inlist):
    i = iter(inlist)
    pivot = next(i)
    low, high = [], []
    for item in i:
        if item < pivot:
            low.append(item)
        else:
            high.append(item)
    return low, pivot, high
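Plugged into the `quick_sort` driver from the question, this list-based partition sorts the example input correctly. The sketch below repeats both functions so that it runs on its own.

```python
def partition(inlist):
    # The first element is the pivot; the rest is split around it.
    i = iter(inlist)
    pivot = next(i)
    low, high = [], []
    for item in i:
        if item < pivot:
            low.append(item)
        else:
            high.append(item)
    return low, pivot, high


def quick_sort(array):
    if len(array) <= 1:
        return array
    low, pivot, high = partition(array)
    return quick_sort(low) + [pivot] + quick_sort(high)


print(quick_sort([5, 3, 4, 2, 7, 6, 1]))  # [1, 2, 3, 4, 5, 6, 7]
```

Because every comparison is made against `pivot`, each recursive call really does receive only smaller (or only greater-or-equal) items.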
You might find these reference implementations helpful while trying to understand your own.
Returning a new list:
def qsort(array):
    if len(array) < 2:
        return array
    head, *tail = array
    less = qsort([i for i in tail if i < head])
    more = qsort([i for i in tail if i >= head])
    return less + [head] + more
Sorting a list in place:
def quicksort(array):
    _quicksort(array, 0, len(array) - 1)
def _quicksort(array, start, stop):
    if stop - start > 0:
        pivot, left, right = array[start], start, stop
        while left <= right:
            while array[left] < pivot:
                left += 1
            while array[right] > pivot:
                right -= 1
            if left <= right:
                array[left], array[right] = array[right], array[left]
                left += 1
                right -= 1
        _quicksort(array, start, right)
        _quicksort(array, left, stop)
Generating sorted items from an iterable:
def qsort(sequence):
    iterator = iter(sequence)
    try:
        head = next(iterator)
    except StopIteration:
        pass
    else:
        try:
            tail, more = chain(next(iterator), iterator), []
            yield from qsort(split(head, tail, more))
            yield head
            yield from qsort(more)
        except StopIteration:
            yield head
def chain(head, iterator):
    yield head
    yield from iterator
def split(head, tail, more):
    for item in tail:
        if item < head:
            yield item
        else:
            more.append(item)
If the pivot ends up needing to stay in the initial position (because it is the lowest value), you swap it with some other element anyway.
Read the Fine Manual :
Quick sort explanation and python implementation :
http://interactivepython.org/courselib/static/pythonds/SortSearch/TheQuickSort.html
Sorry, this should be a comment, but it has too complicated structure for a comment.
See what happens for array being [7, 8]:
pivot = 7
i = 1
for loop does nothing
array[0] becomes array[i] which is 8
array[i] becomes pivot which is 7
you return array[0:1], pivot and array[2:2], which are [8], 7 and [] (the third subexpression contributes nothing), so the concatenated result is [8, 7]...
If you explicitly include the returned pivot in concatenation, you should skip it in the array returned.
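The trace above can be reproduced by running the question's original partition on [7, 8]. In this sketch, `partition_original` is a name chosen here, and the three-line swap is written as a tuple assignment; the logic is otherwise unchanged.

```python
def partition_original(array):
    pivot = array[0]
    i = 1
    for j in range(i, len(array)):
        # Compares two array entries with each other, never with the pivot.
        if array[j] < array[i]:
            array[i], array[j] = array[j], array[i]
            i += 1
    # Unconditional swap: moves array[i] in front of the pivot
    # even when it is larger than the pivot.
    array[0] = array[i]
    array[i] = pivot
    return array[0:i], pivot, array[i + 1:]


print(partition_original([7, 8]))  # ([8], 7, [])
```

The "low" part comes back as `[8]` even though 8 is greater than the pivot 7, which is how an already sorted pair ends up reversed.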
okay i "fixed" it, at least on the one input i've tried it on (and idk why... python issues)
def partition(array):
    pivot = array[0]
    i = 1
    for j in range(i, len(array)):
        if array[j] < pivot:
            temp = array[i]
            array[i] = array[j]
            array[j] = temp
            i += 1
    array[0] = array[i-1]
    array[i-1] = pivot
    return array[0:i-1], pivot, array[i:(len(array))]
def quick_sort(array):
    if len(array) <= 1:
        return array
    low, pivot, high = partition(array)
    #quick_sort (low)
    #quick_sort (high)
    return quick_sort (low) + [pivot] + quick_sort (high)
array = [5,3,4,2,7,6,1]
print(quick_sort(array))
# prints [1,2,3,4,5,6,7]
