Implementing modified binary search in Python

The problem is to find the index of the largest element which is less than or equal to N in a sorted list. To tackle the problem, I wrote the following code, but it does not seem to work.
def find_index(primes, N, start, end):
    mid = int((start + end)/2)
    if start == end:
        return start
    if primes[mid - 1] < N:
        if primes[mid] == N:
            return mid
        elif primes[mid] > N:
            return mid - 1
        else:
            return find_index(primes, N, start, mid + 1)
    elif primes[mid - 1] > N:
        if primes[mid] > N:
            return find_index(primes, N, mid - 1, end)
What obvious condition am I missing? Is there any better method to find the index in O(log(n))?

If you have a list of size 2 or larger, divide and conquer:
def find_index(primes, N, start, end):
    mid = (start + end) // 2  # floor of the midpoint
    if start == end:
        return start
    if end - start == 1:
        # two candidates left: take the upper one if it is still <= N
        return end if primes[end] <= N else start
    if primes[mid] == N:
        return mid
    elif primes[mid] < N:
        return find_index(primes, N, mid, end)
    else:
        return find_index(primes, N, start, mid)
The logic here is that repeatedly choosing sublists via the midpoint will eventually yield a difference of 1 between start and end. This can be assumed for the following reasons:
Since the midpoint is computed as floor((start + end)/2), it will eventually be 1 index away from the upper boundary.
A list of size 2 (e.g. start/end 3/4 or 2/3) will always yield a list of size 2, since the midpoint will be the lower bound. This is evident given floor((n + n + 1)/2) = floor((2n + 1)/2) = n.
Once a difference of 1 has been reached, the top and bottom can be checked against the requirement of being less than or equal to N.
The completely empty list case is not handled.
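As an alternative to hand-written recursion, the same lookup can be sketched with the standard library's bisect module. This is a minimal sketch, assuming the list is sorted in ascending order; the helper name find_index_le is hypothetical, and it returns -1 when no element is less than or equal to N (which also covers the empty list).

```python
from bisect import bisect_right

def find_index_le(sorted_values, n):
    """Return the index of the last element <= n, or -1 if there is none."""
    # bisect_right gives the insertion point after any elements equal to n,
    # so the element just before that point is the last one <= n.
    return bisect_right(sorted_values, n) - 1

primes = [2, 3, 5, 7, 11]
print(find_index_le(primes, 6))   # 2  (primes[2] == 5)
print(find_index_le(primes, 7))   # 3  (primes[3] == 7)
print(find_index_le(primes, 1))   # -1 (nothing is <= 1)
print(find_index_le([], 5))       # -1 (empty list)
```

Both bisect_right and the recursive version above run in O(log n) time.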

Related

First occurrence of a positive integer using binary search

I've tried to write code that finds the first occurrence of a positive integer in a sorted array using binary search, but it doesn't work.
Here's the code:
def findFirstOccurrence(arr):
    (left, right) = (0, len(arr) - 1)
    result = -1
    while left <= right:
        mid = (left + right) // 2
        if 0 < arr[mid]:
            result = mid
            right = mid - 1
        elif 0 > arr[mid]:
            right = mid - 1
        else:
            left = mid + 1
    return result
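The code below should help.
def findFirstOccurrence(arr):
    if not arr:  # an empty array has no positive element
        return None
    left, right = 0, len(arr) - 1
    # narrow the window until at most two candidates remain
    while not (left == right or right == left + 1):
        mid = (left + right) // 2
        if arr[mid] > 0:
            right = mid
        else:
            left = mid
    if arr[left] > 0:
        return left
    if arr[right] > 0:
        return right
    return None
print(findFirstOccurrence([-4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6]))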
Output
5
The error comes from the lines
    elif 0 > arr[mid]:
        right = mid - 1
If arr[mid] is negative, we decrease the right pointer, i.e. we look further to the left. But if the array is sorted in ascending order, then anything further to the left of a negative number is still negative, so we'll never reach a positive number and the program will fail.
What we want to do instead is look to the right, where the positive numbers are. The lines
    else:
        left = mid + 1
already do that in the case that arr[mid] == 0. Removing the two erroneous lines would allow the case that arr[mid] < 0 to fall through and do what we want.
Final code:
    if arr[mid] > 0:
        result = mid
        right = mid - 1
    else:
        left = mid + 1
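Put together, the corrected loop might look like the following sketch. It assumes the array is sorted in ascending order; the function name find_first_positive is hypothetical, and -1 is kept as the "not found" value from the question.

```python
def find_first_positive(arr):
    """Return the index of the first element > 0 in ascending arr, or -1."""
    left, right = 0, len(arr) - 1
    result = -1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] > 0:
            # arr[mid] is a candidate; keep looking for an earlier one.
            result = mid
            right = mid - 1
        else:
            # Zero or negative: the first positive must be to the right.
            left = mid + 1
    return result

print(find_first_positive([-4, -3, -2, -1, 0, 1, 2, 3]))  # 5
print(find_first_positive([-1, 0]))                        # -1
```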

Recursive binary search algorithm doesn't stop executing after condition is met, returns a nonetype object

I have made a binary search algorithm, biSearch(A, key, low, high). It takes in a sorted array and a key, and returns the position of key in the array. low and high are the lower and upper bounds of the search range.
It almost works, save for one problem:
On the second "iteration" (not sure what the recursive equivalent of that is), a condition is met and the algorithm should stop running and return "index". I commented where this happens. Instead, what ends up happening is that the code continues on to the next condition, even though the preceding condition is true. The correct result, 5, is then overridden and the new result is a nonetype object.
Within my code, I have commented in caps the problems at the locations where they occur. Help is much appreciated, and I thank you in advance!
"""
Created on Sat Dec 28 18:40:06 2019
"""
def biSearch(A, key, low = False, high = False):
    if low == False:
        low = 0
    if high == False:
        high = len(A)-1
    if high == low:
        return A[low]
    mid = low + int((high -low)/ 2)
    # if key == A[mid] : two cases
    if key == A[mid] and high - low == 0: #case 1: key is in the last pos. SHOULD STOP RUNNING HERE
        index = mid
        return index
    elif key == A[mid] and (high - low) > 0:
        if A[mid] == A[mid + 1] and A[mid]==A[mid -1]: #case 2: key isnt last and might be repeated
            i = mid -1
            while A[i] == A[i+1]:
                i +=1
            index = list(range(mid- 1, i+1))
        elif A[mid] == A[mid + 1]:
            i = mid
            while A[i]== A[i+1]:
                i += 1
            index = list(range(mid, i+1))
        elif A[mid] == A[mid -1]:
            i = mid -1
            while A[i] == A[i +1]:
                i += 1
            index = list(range(mid, i +1))
    elif key > A[mid] and high - low > 0: # BUT CODE EXECUTES THIS LINE EVEN THOUGH PRECEDING IS ALREADY MET
        index = biSearch(A, key, mid+1, high)
    elif key < A[mid] and high - low > 0:
        index = biSearch(A, key, low, mid -1)
        return index
    elif A[mid] != key: # if key DNE in A
        return -1
#biSearch([1,3,5, 4, 7, 7,7,9], 1, 8, 7)
#x = biSearch([1,3,5, 4, 7,9], 1, 6, 9)
x = biSearch([1,3,5, 4, 7,9],9)
print(x)
# x = search([1,3,5, 4, 7,9], 9)
This function is not a binary search. Binary search should run in O(log(n)) time and works on pre-sorted lists, but the complexity of this algorithm is at least O(n log(n)) because it sorts its input parameter list for every recursive call. Even without the sorting, there are linear statements like list(range(mid, i +1)) on each call, making the complexity quadratic. You'd be better off with a linear search using list.index.
The function mutates its input parameter, which no search function should do (we want to search, not search and sort).
Efficiency and mutation aside, the logic is difficult to parse and is overkill in any circumstance. Not all nested conditionals lead to a return, so it's possible to return None by default.
You can use the builtin bisect module:
>>> from bisect import *
>>> bisect_left([1,2,2,2,2,3,4,4,4,4,5], 2)
1
>>> bisect_left([1,2,2,2,2,3,4,4,4,4,5], 4)
6
>>> bisect_right([1,2,2,2,2,3,4,4,4,4,5], 4)
10
>>> bisect_right([1,2,2,2,2,3,4,4,4,4,5], 2)
5
>>> bisect_right([1,2,2,2,2,3,4,4,4,4,5], 15)
11
>>> bisect_right([1,2,5,6], 3)
2
If you have to write this by hand as an exercise, start by looking at bisect_left's source code:
def bisect_left(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.
    The return value i is such that all e in a[:i] have e < x, and all e in
    a[i:] have e >= x. So if x already appears in the list, a.insert(i, x) will
    insert just before the leftmost x already there.
    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        # Use __lt__ to match the logic in list.sort() and in heapq
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo
This is easy to implement recursively (if desired) and then test against the builtin:
def bisect_left(a, target, lo=0, hi=None):
    if hi is None: hi = len(a)
    mid = (hi + lo) // 2
    if lo >= hi:
        return mid
    elif a[mid] < target:
        return bisect_left(a, target, mid + 1, hi)
    return bisect_left(a, target, lo, mid)
if __name__ == "__main__":
    from bisect import bisect_left as builtin_bisect_left
    from random import choice, randint
    from sys import exit
    for _ in range(10000):
        a = sorted(randint(0, 100) for _ in range(100))
        if any(bisect_left(a, x) != builtin_bisect_left(a, x) for x in range(-1, 101)):
            print("fail")
            exit(1)
Logically, for any call frame, there are only 3 possibilities:
The lo and hi pointers have met or crossed, in which case we've either found the element or figured out where it should be if it were in the list; either way, return the midpoint.
The element at the midpoint is less than the target, which guarantees that the target is in the tail half of the search space, if it exists.
The element at the midpoint matches or is greater than the target, which guarantees that the target is in the front half of the search space (midpoint included).
Python integers don't overflow, so you can use the simplified midpoint calculation (lo + hi) // 2.
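Since bisect_left returns an insertion point rather than a "found" flag, an exact-match search needs one extra comparison. The following is a minimal sketch of that pattern; the helper name index_of and the -1 sentinel are assumptions chosen to mirror the question, not part of the bisect module.

```python
from bisect import bisect_left

def index_of(sorted_values, key):
    """Return the index of the leftmost occurrence of key, or -1 if absent."""
    i = bisect_left(sorted_values, key)
    # i is where key would be inserted; it is a hit only if key is already there.
    if i < len(sorted_values) and sorted_values[i] == key:
        return i
    return -1

values = [1, 3, 4, 5, 7, 7, 7, 9]
print(index_of(values, 9))  # 7
print(index_of(values, 7))  # 4 (leftmost of the repeated 7s)
print(index_of(values, 6))  # -1
```

If the full run of duplicates is needed, bisect_right(values, key) gives the index just past the last occurrence.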

What is the runtime for this particular algorithm?

I think this particular code runs in O((log n)^2) time, because each findindex call recurses to a depth of log n and we call it log n times. Can someone confirm this?
I hope one of you can think of this as a small quiz and help me with it.
Given a sorted array of n integers that has been rotated an unknown
number of times, write code to find an element in the array. You may
assume that the array was originally sorted in increasing order.
# Ex
# input find 5 in {15,16,19,20,25,1,3,4,5,7,10,14}
# output 8
# runtime(log n)
def findrotation(a, tgt):
    return findindex(a, 0, len(a)-1, tgt, 0)
def findindex(a, low, high, target, index):
    if low>high:
        return -1
    mid = int((high + low) / 2)
    if a[mid] == target:
        index = index + mid
        return index
    else:
        b = a[low:mid]
        result = findindex(b, 0, len(b)-1, target, index)
        if result == -1:
            index = index + mid + 1
            c = a[mid+1:]
            return findindex(c, 0, len(c)-1, target, index)
        else:
            return result
This algorithm is supposed to be O(log n), but the implementation does not achieve that.
Your algorithm never decides whether to search only the left subarray or only the right subarray; it tries both, which visits O(n) elements in the worst case.
You are also slicing the array with a[low:mid] and a[mid + 1:], which copies O(n) elements per call.
Together these give the recurrence T(n) = 2T(n/2) + O(n), so the overall complexity is O(n log n) in the worst case.
Assuming there are no duplicates in the array, an O(log n) binary search in Python 3 looks like this:
A = [15, 16, 19, 20, 25, 1, 3, 4, 5, 7, 10, 14]
low = 0
hi = len(A) - 1
def findindex(A, low, hi, target):
    if low > hi:
        return -1
    mid = (hi + low) // 2
    if A[mid] == target:
        return mid
    if A[mid] >= A[low]:
        # A[low..mid] is sorted, so a range check decides the side
        if target < A[mid] and target >= A[low]:
            return findindex(A, low, mid - 1, target)
        else:
            return findindex(A, mid + 1, hi, target)
    if A[mid] < A[low]:
        # the rotation point lies in A[low..mid], so A[mid..hi] is sorted
        if target < A[mid] or target >= A[low]:
            return findindex(A, low, mid - 1, target)
        else:
            return findindex(A, mid + 1, hi, target)
    return -1
print(findindex(A, low, hi, 3))

Backtracing the longest palindromic subsequence

I modified the code from Geeks for Geeks to backtrace the actual subsequence, not only its length. But when I backtrace and get to the end where I can put an arbitrary character to the middle of the palindrome, I find my solution to be sloppy and not 'Pythonic'. Can someone please help me?
This piece smells particularly bad (if it works correctly at all):
    if length_matrix[start][end] == 1 and substr_length >= 0:
        middle = sequence[start]
Here is the forward pass:
import numpy as np
def calc_subsequence_lengths(sequence):
    n = len(sequence)
    # Create a table to store results of subproblems
    palindrome_lengths = np.zeros((n, n))
    # Strings of length 1 are palindrome of length 1
    np.fill_diagonal(palindrome_lengths, 1)
    for substr_length in range(2, n + 1):
        for i in range(n - substr_length + 1):
            j = i + substr_length - 1
            if sequence[i] == sequence[j] and substr_length == 2:
                palindrome_lengths[i][j] = 2
            elif sequence[i] == sequence[j]:
                palindrome_lengths[i][j] = palindrome_lengths[i + 1][j - 1] + 2
            else:
                palindrome_lengths[i][j] = max(palindrome_lengths[i][j - 1],
                                               palindrome_lengths[i + 1][j])
    return palindrome_lengths
And here is the traceback:
def restore_palindrome(length_matrix, sequence):
palindrome_left = ''
middle = ''
n, n = np.shape(length_matrix)
# start in the north-eastern corner of the matrix
substr_length, end = n - 1, n-1
# traceback
while substr_length > 0 and end > 1:
start = end - substr_length
# if possible, go left
if length_matrix[start][end] == (length_matrix[start][end - 1]):
substr_length -= 1
end -= 1
# the left cell == current - 2, but the lower is the same as current, go down
elif length_matrix[start][end] == (length_matrix[start + 1][end]):
substr_length -= 1
# both left and lower == current - 2, go south-west
else:
palindrome_left += sequence[start]
substr_length -= 2
end -= 1
if length_matrix[start][end] == 1 and substr_length >= 0:
middle = sequence[start+1]
result = ''.join(palindrome_left) + middle + ''.join(palindrome_left[::-1])
return result, int(length_matrix[0][n-1])
Update
First off, the problem is to calculate the longest palindromic subsequence, which need not be contiguous (as stated in the article I referred to). For the sequence BBABCBCAB, the output should be BABCBAB.
Secondly, as I have pointed out, I'm building upon an existing DP solution which works in O(N^2) time and space. It calculates the length just fine, so I need to backtrace the actual palindrome in the most elegant way, not sacrificing efficiency for elegance.
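One way the traceback could be organised is sketched below, keeping the O(N^2) time and space of the DP. This is not the code from the referenced article: it uses plain nested lists instead of NumPy, walks two indices i and j inward instead of tracking substr_length, and the function name is hypothetical. The middle character falls out naturally as the case where the two indices meet.

```python
def longest_palindromic_subsequence(sequence):
    """Return one longest palindromic subsequence of sequence."""
    n = len(sequence)
    if n == 0:
        return ""
    # lengths[i][j] = length of the longest palindromic subsequence of sequence[i..j]
    lengths = [[0] * n for _ in range(n)]
    for i in range(n):
        lengths[i][i] = 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            if sequence[i] == sequence[j]:
                inner = lengths[i + 1][j - 1] if span > 2 else 0
                lengths[i][j] = inner + 2
            else:
                lengths[i][j] = max(lengths[i][j - 1], lengths[i + 1][j])
    # Traceback: collect the left half, then mirror it.
    left = []
    i, j = 0, n - 1
    while i < j:
        if sequence[i] == sequence[j]:
            left.append(sequence[i])  # both ends belong to the palindrome
            i += 1
            j -= 1
        elif lengths[i][j - 1] >= lengths[i + 1][j]:
            j -= 1                    # dropping the right end loses nothing
        else:
            i += 1                    # dropping the left end loses nothing
    middle = sequence[i] if i == j else ""  # odd length: one character remains
    half = "".join(left)
    return half + middle + half[::-1]

print(longest_palindromic_subsequence("BBABCBCAB"))  # BABCBAB
```

When several subsequences share the maximum length, the tie-break in the elif branch decides which one is returned.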

Binary Search in Python 2 variable not updating causing an infinite loop.....why is the variable not updating?

This is my binary search. The mid does not update and it loops infinitely.
def binary_search (z, A, start, end):
    if len(A) == 0:
        return None
    else:
        mid = start + (end - start) / 2
        if (z < A[mid]) and (z > A[mid-1]):
            return A[mid-1]
        elif (z < A[mid]):
            return binary_search(z, A, start, mid)
        elif (z > A[mid]):
            return binary_search(z, A, mid, end)
def binary_search (z, A, start, end):
    if end < start:
        return None
    else:
        mid = start + (end - start) // 2  # floor division in both Python 2 and 3
        if (z < A[mid]):
            return binary_search(z, A, start, mid-1)
        elif (z > A[mid]):
            return binary_search(z, A, mid+1, end)
        else:
            return mid
I changed a couple of things around.
First, I changed the base case to end < start, because len(A) == 0 stays constant across the recursive calls and so can never serve as a base case.
Also, when you recurse you need to skip the mid value (use mid - 1 and mid + 1), because mid has already been checked; it is the value returned in the final else branch.
I tested the code and it works!
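To see why skipping mid matters, here is a small sketch that computes the range the original code would recurse on. The helper name is hypothetical, and // is used so it behaves the same under Python 2 and 3.

```python
def next_range_without_skip(start, end):
    # Mirrors the original recursive call binary_search(z, A, mid, end).
    mid = start + (end - start) // 2
    return mid, end

print(next_range_without_skip(3, 4))  # (3, 4): the range never shrinks
```

Once end - start is 1, mid equals start, so recursing on (mid, end) repeats the same call forever; recursing on (mid + 1, end) shrinks the range on every step.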
