Implementing binary search to find index? - python

I understand conceptually how binary search works, but I always have trouble implementing it when I need to find an index in an array. For instance, for the Search Insert Position problem on LeetCode, this is what I wrote:
def searchInsert(self, nums, target):
    """
    :type nums: List[int]
    :type target: int
    :rtype: int
    """
    if target > nums[-1]:
        return len(nums)
    low = 0
    high = len(nums) - 1
    while high > low:
        mid = (low + high) // 2
        if nums[mid] == target:
            return mid
        elif nums[mid] > target:
            high = mid
        else:
            low = mid + 1
    return low
It works, but I don't understand why I have to update low to mid + 1 instead of mid. Similarly, why am I updating high to mid instead of mid - 1? I've tried updating low/high with every combination of mid, mid - 1, and mid + 1, and the above is the only one that works, but I have no idea why.
When implementing binary search for these kinds of problems, is there a way to reason through how to update the low/high values?

This is my personal favorite:
while high >= low:
    mid = (low + high) // 2
    if nums[mid] >= target:
        high = mid - 1
    else:
        low = mid + 1
return low
# or, for a boolean result: return low < len(nums) and nums[low] == target
The difference shows up when the array contains duplicate values.
For example, take the array [1,1,2,2,3,3,3,3,4].
With your function, search(arr, 1) returns 1 but search(arr, 2) returns 2.
Why does it return the rightmost index in the run of 1s but the leftmost index in the run of 2s? Because your function returns as soon as mid happens to land on the target, so which duplicate you get depends on where mid falls.
I think the key is the line if nums[mid] >= target:.
When mid lands on a value equal to the target, the range still shrinks with high = mid - 1. This means high can no longer be the answer, because the match we found is at mid. [1]
At the last step of the binary search the range shrinks to nothing, and the loop ends when low and high cross. So the answer must be low or high, and from [1] we know it is not high. It is low, the leftmost index whose value is >= target.
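The reasoning above can be turned into an invariant that tells you how to update each bound. Here is a minimal sketch, assuming a half-open range [low, high) and a hypothetical function name lower_bound (neither is from the question). It should agree with the standard library's bisect.bisect_left.

```python
import bisect

def lower_bound(nums, target):
    """Return the first index i with nums[i] >= target, or len(nums) if none."""
    low, high = 0, len(nums)      # invariant: the answer lies in [low, high]
    while low < high:
        mid = (low + high) // 2   # floor division, so low <= mid < high
        if nums[mid] < target:
            low = mid + 1         # mid is too small, so it cannot be the answer
        else:
            high = mid            # mid may be the answer, so keep it in range
    return low

arr = [1, 1, 2, 2, 3, 3, 3, 3, 4]
print(lower_bound(arr, 2))        # leftmost 2
print(lower_bound(arr, 5))        # insert position past the end
print(lower_bound(arr, 3) == bisect.bisect_left(arr, 3))
```

Because mid is rounded down, it can equal low but never high. Setting low = mid could therefore leave the range unchanged and loop forever, while high = mid always shrinks it.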

Related

Find max element using recursion

I am new to recursion and the task is to find the POSITION of largest element in the array using recursion. This is my code:
def calc(low , high):
    print(low, high)
    if low == high:
        return low
    max_1 = calc(low , low +high//2)
    max_2 = calc(low + high//2 , high)
    if a[max_1] > a[max_2]:
        return max_1
a = [4,3,6,1,9]
print(calc(0 , len(a)))
What am I doing wrong?
While Google gives me solutions for finding the max element in an array, none of them find the position of the max element. Thanks in advance.
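You are almost there. The two main mistakes are:
The base case should be low + 1 == high.
The midpoint should be (low + high) // 2.
The final if also needs an else branch that returns max_2; without it the function returns None whenever a[max_1] <= a[max_2].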
def calc(low , high):
    if low + 1 == high:
        return low
    max_1 = calc(low , (low + high) // 2)
    max_2 = calc((low + high) // 2 , high)
    if a[max_1] > a[max_2]:
        return max_1
    else:
        return max_2
a = [4,3,6,1,9]
print(calc(0 , len(a)))
## 4
Your solution recurses forever because of the wrong base case and the wrong midpoint.
When low == 0 and high == 1, low != high, so you trigger two calls
max_1 = calc(low , low + high // 2)
max_2 = calc(low + high // 2 , high)
which evaluate to
max_1 = calc(0, 0) ## Returns 0, because low == high
max_2 = calc(0, 1) ## Notice that again low == 0 and high == 1
The second call, max_2 = calc(0, 1), again triggers two calls, one of which is again max_2 = calc(0, 1). This recursion never returns, so max_2 is never assigned and the lines after it (if a[max_1] > a[max_2]: ... ) are never reached.
That is why you should check for the base case low + 1 == high instead of low == high. Now you can test yourself: will the following code recurse forever or not? This time, does max_2 get a value assigned to it, and do the lines after it get evaluated?
def calc(low , high):
    if low + 1 == high: # Here is changed from your solution
        return low
    max_1 = calc(low , low + high // 2) # Here is same to your solution
    max_2 = calc(low + high // 2 , high) # Here is same as well
    if a[max_1] > a[max_2]:
        return max_1
    else:
        return max_2
If you get the answer right, you are halfway to understanding your mistake. Then you can play with different midpoints and print at each level of recursion to see how that affects the results and get a full understanding.
I think this is what you are trying to do. You should pass list slices to the function - this makes it much simpler than trying to pass low and high indices, and avoids accessing the list as a global variable - and add the midpoint to the resulting index that comes from the right hand side of the list.
def idxmax(l):
    if len(l) == 1:
        return 0
    midpoint = len(l) // 2
    a = idxmax(l[:midpoint])
    b = idxmax(l[midpoint:]) + midpoint
    if l[a] >= l[b]:
        return a
    else:
        return b
print(idxmax([4,3,6,1,9]))
This returns the index of the first occurrence of the maximum, e.g. idxmax([4,9,3,6,1,9]) == 1
If you want to implement it by passing indices instead of slices (possibly more efficient by not making multiple copies of the list), you could do it like this:
def idxmax(l, start=0, end=None):
    if end is None:
        end = len(l) - 1
    if end == start:
        return start
    midpoint = (start + end) // 2
    a = idxmax(l, start, midpoint)
    b = idxmax(l, midpoint + 1, end)
    if l[a] >= l[b]:
        return a
    else:
        return b
print(idxmax([4,3,6,1,9]))
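I believe the task was to find only the POSITION of the max number (not the value itself).
So the function starts by comparing the last value with the max value of the list, and if they are equal it returns the last index (the length of the list minus one) as the position. Otherwise it calls itself on the list with the last element removed and compares again, until only one value is left. Note that it returns the last occurrence when the maximum is repeated, and that calling max(lst) at every level makes it quadratic in the worst case.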
def calc(lst):
    if lst[len(lst) - 1] == max(lst):
        return len(lst) - 1
    if len(lst) > 1:
        return calc(lst[:-1])
print(calc([0, 1, 2, 3, 4, 5, 6])) # 6
print(calc([7, 1, 2, 3, 4, 5, 6])) # 0
print(calc([0, 1, 8, 3, 4, 5, 6])) # 2

Binary Search on Sorted List with Duplicates

To learn divide-and-conquer algorithms, I am implementing a function in Python called binary_search that will get the index of the first occurrence of a number in a non-empty, sorted list (elements of the list are non-decreasing, positive integers). For example, binary_search([1,1,2,2,3,4,4,5,5], 4) == 5, binary_search([1,1,1,1,1], 1) == 0, and binary_search([1,1,2,2,3], 5) == -1, where -1 means the number cannot be found in the list.
Below is my solution. Although the solution below passed all the tests I created manually it failed test cases from a black box unit tester. Could someone let me know what's wrong with the code below?
def find_first_index(A,low,high,key):
    if A[low] == key:
        return low
    if low == high:
        return -1
    mid = low+(high-low)//2
    if A[mid]==key:
        if A[mid-1]==key:
            return find_first_index(A,low,mid-1,key)
        else:
            return mid
    if key <A[mid]:
        return find_first_index(A,low,mid-1,key)
    else:
        return find_first_index(A, mid+1, high,key)
def binary_search(keys, number):
    index = find_first_index(A=keys, low=0,high=len(keys)-1,key=number)
    return(index)
This should work:
def find_first_index(A, low, high, key):
    if A[low] == key:
        return low
    if low == high:
        return -1
    mid = low + (high - low) // 2
    if A[mid] >= key:
        return find_first_index(A, low, mid, key)
    return find_first_index(A, mid + 1, high, key)
def binary_search(keys, number):
    return find_first_index(keys, 0, len(keys) - 1, number)
Your solution does not work, as you have already realized. For example, it breaks with the following input:
>>> binary_search([1, 5], 0)
...
RecursionError: maximum recursion depth exceeded in comparison
As you can see, the function does not even terminate; there is infinite recursion going on here. Try to "run" your program on a piece of paper (or use a debugger) to understand what's going on; it's very instructive.
So, what's the error? The problem is that, from some call onward, high < low. In this specific case, in the first call low == 0 and high == 1, so mid == 0 (because low + (high - low) // 2 == 0). Since key < A[mid], you then call find_first_index(A, low, mid - 1, key), which is find_first_index(A, 0, -1, key). From there low == high is never true again: floor division gives mid == -1, A[-1] silently wraps around to the last element, and the next call is find_first_index(A, 0, -2, key), which computes mid == -1 again and calls itself with the same arguments. Therefore, you have infinite recursion.
A simple fix in this case would be to have
if low >= high:
    return -1
Or just use my solution above: checking mid - 1 is, in my opinion, not a good idea, or at least you must be much more careful when doing it.
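For comparison, the same first-occurrence lookup can be built on the standard library. This is only a sketch, assuming the hypothetical name first_index (not from the question). bisect_left returns the first position whose element is >= the key, so a single equality check afterwards is enough.

```python
from bisect import bisect_left

def first_index(keys, number):
    """Index of the first occurrence of number in sorted keys, or -1."""
    i = bisect_left(keys, number)   # first position with keys[i] >= number
    if i < len(keys) and keys[i] == number:
        return i
    return -1                       # number is not in the list

print(first_index([1, 1, 2, 2, 3, 4, 4, 5, 5], 4))
print(first_index([1, 5], 0))
```

Since there is no recursion and no mid - 1 bookkeeping, the out-of-range case that broke the original code cannot occur here.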

Why does binary search algorithm not work?

I've copied code from the book "Grokking Algorithms" for finding an item in a list using the binary search algorithm. The code runs without errors. However, it doesn't work for certain numbers, for example if I call it to find 1. What's wrong?
def binary_search(list, item):
    low = 0
    high = len(list) - 1
    while low <= high:
        mid = (low + high)/2
        guess = list[mid]
        if guess == item:
            return mid
        if guess > item:
            high = mid + 1
        else:
            low = mid + 1
    return None
list1 = []
for i in range (1, 101):
    list1.append(i)
print(list1)
print(binary_search(list1, 1))
Two issues:
Use integer division (so it will work in Python 3): mid = (low + high)//2
When guess > item you want to exclude the guessed value from the next range, but with high = mid + 1 it is still in the range. Change it to high = mid - 1.
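With both fixes applied, the function could look like the sketch below. The parameter is renamed to items here, which is my own choice to avoid shadowing the built-in list; everything else follows the code in the question.

```python
def binary_search(items, item):
    low = 0
    high = len(items) - 1
    while low <= high:
        mid = (low + high) // 2   # integer division, so mid is a valid index
        guess = items[mid]
        if guess == item:
            return mid
        if guess > item:
            high = mid - 1        # exclude mid: it was already checked
        else:
            low = mid + 1
    return None                   # item is not in the list

numbers = list(range(1, 101))
print(binary_search(numbers, 1))  # 0
```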

What is the runtime for this particular algorithm?

I think this particular code is O((log n)^2), because each findindex call has depth log n and we are calling it log n times. Can someone confirm this?
I hope one of you can think of this as a small quiz and help me with it.
Given a sorted array of n integers that has been rotated an unknown
number of times, write code to find an element in the array. You may
assume that the array was originally sorted in increasing order.
# Ex
# input find 5 in {15,16,19,20,25,1,3,4,5,7,10,14}
# output 8
# runtime(log n)
def findrotation(a, tgt):
    return findindex(a, 0, len(a)-1, tgt, 0)
def findindex(a, low, high, target, index):
    if low>high:
        return -1
    mid = int((high + low) / 2)
    if a[mid] == target:
        index = index + mid
        return index
    else:
        b = a[low:mid]
        result = findindex(b, 0, len(b)-1, target, index)
        if result == -1:
            index = index + mid + 1
            c = a[mid+1:]
            return findindex(c, 0, len(c)-1, target, index)
        else:
            return result
This algorithm is supposed to be O(log n), but your implementation is not.
Your algorithm never decides between the left subarray and the right subarray; it tries both, so it can visit every element, which is O(n).
You are also slicing the array with a[low:mid] and a[mid + 1:], and each slice copies its elements, which is O(n) per call.
Together these give the recurrence T(n) = 2T(n/2) + O(n), which is O(n log n) in the worst case rather than O(log n).
Assuming there are no duplicates in the array, an O(log n) binary search in Python 3 looks like this:
A = [15,16,19,20,25,1,3,4,5,7,10,14]
low = 0
hi = len(A) - 1
def findindex(A, low, hi, target):
    if low > hi:
        return -1
    mid = (low + hi) // 2  # integer midpoint, low <= mid <= hi
    if A[mid] == target:
        return mid
    if A[mid] >= A[low]:
        # A[low..mid] is sorted, so a simple range check decides the side
        if target < A[mid] and target >= A[low]:
            return findindex(A, low, mid - 1, target)
        else:
            return findindex(A, mid + 1, hi, target)
    if A[mid] < A[low]:
        # the rotation point is in A[low..mid], so A[mid..hi] is sorted
        if target < A[mid] or target >= A[low]:
            return findindex(A, low, mid - 1, target)
        else:
            return findindex(A, mid + 1, hi, target)
    return -1
print(findindex(A, low, hi, 3))
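The same idea also works without recursion. The following is a sketch only, assuming distinct elements and a hypothetical function name find_rotated (not from the question). At each step one half of the range is sorted, so one range check decides which half to keep.

```python
def find_rotated(a, target):
    """Index of target in a rotated sorted list of distinct values, or -1."""
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2
        if a[mid] == target:
            return mid
        if a[low] <= a[mid]:                  # left half a[low..mid] is sorted
            if a[low] <= target < a[mid]:
                high = mid - 1                # target can only be on the left
            else:
                low = mid + 1
        else:                                 # right half a[mid..high] is sorted
            if a[mid] < target <= a[high]:
                low = mid + 1                 # target can only be on the right
            else:
                high = mid - 1
    return -1

data = [15, 16, 19, 20, 25, 1, 3, 4, 5, 7, 10, 14]
print(find_rotated(data, 5))                  # 8, as in the problem statement
```

Each iteration discards half of the range and never copies the list, so the running time is O(log n).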

Binary Search Algorithm with interval

I am trying to change my code so that, instead of finding a specific value in the array, it outputs the values that fall within an interval, for example 60-70, if any are found. Any help is appreciated.
def binary (array, value):
    while len(array)!= 0:
        mid = len(array) // 2
        if value == array[mid]:
            return value
        elif value > array[mid]:
            array = array[mid+1:]
        elif value < array [mid]:
            array = array[0:mid]
sequence = [1,2,5,9,13,42,69,123,256]
print( "found", binary(sequence,70) )
I have this so far and want it to find a specified interval, so if I specify 60-70 it will find what is in between.
Actually this is pretty simple:
To find the elements in the interval [lower, upper], binary search the array arr for the index n of the smallest element with arr[n] >= lower, and for the index m of the largest element with arr[m] <= upper.
Now there are several possibilities:
n < m: there are multiple solutions in the array. All of them are in the subarray from index n up to and including index m.
n = m: there is precisely one solution, arr[n].
n > m: no solutions exist.
Searching for the first value at or above a certain threshold can be done using binary search like this:
def lowestGreaterThan(arr, threshold):
    low = 0
    high = len(arr)
    while low < high:
        mid = (low + high) // 2  # integer midpoint, no math import needed
        print("low = ", low, " mid = ", mid, " high = ", high)
        if arr[mid] == threshold:
            return mid
        elif arr[mid] < threshold and mid != low:
            low = mid
        elif arr[mid] > threshold and mid != high:
            high = mid
        else:
            # arr[low] < threshold and high == low + 1, so low + 1 is the
            # index of the first element >= threshold; terminate there
            high = low = low + 1
    return low
Sorry about the looks of the code, my Python is far from perfect. Anyway, this ought to show the basic idea behind the approach. The algorithm searches for the index ind of the first element in the array with the property arr[ind] >= threshold.
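The two index searches described above can also be done with the standard library's bisect module. This is a sketch under the assumption that both interval bounds are inclusive; the function name values_in_interval is hypothetical.

```python
from bisect import bisect_left, bisect_right

def values_in_interval(arr, lower, upper):
    """Return the elements x of the sorted list arr with lower <= x <= upper."""
    n = bisect_left(arr, lower)        # first index with arr[n] >= lower
    m = bisect_right(arr, upper) - 1   # last index with arr[m] <= upper
    return arr[n:m + 1]                # empty list when n > m

sequence = [1, 2, 5, 9, 13, 42, 69, 123, 256]
print(values_in_interval(sequence, 60, 70))   # [69]
```

When n > m the slice is empty, which covers the "no solutions" case without a separate check.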
