My binary search seems to be stuck in an infinite loop: it does not output anything for my test data. second_list is in sorted order, and I'm checking for duplicates from the first list and then adding those duplicates to an empty list. My binary search narrows the range until there is one element left, which I then check against the item, and I would like to keep this structure. I believe the problem is that the right pointer is not decreasing, but I can't see why that is.
def detect_duplicates_binary(first_list, second_list):
    duplicates_list = []
    for item in first_list:
        left = 0
        right = len(second_list) - 1
        while left < right:
            midpoint = (left + right) // 2
            if second_list[midpoint] > item:
                right = midpoint - 1
            else:
                left = midpoint
        if second_list[left] == item:
            duplicates_list.append(item)
    return duplicates_list
Pathology
Here's a possible execution sequence that would result in an infinite loop:
>>> left = 10
>>> right = 11
>>> left < right
True
>>> midpoint = (left + right) // 2
>>> midpoint
10
>>> left = midpoint
>>> left
10
>>> # No change!
Suggestion 1
Factor out just the bisect function and test it separately from the "detect duplicates" code, which is a separate algorithm.
Suggestion 2
Use Python's own bisect module -- it is already tested and documented.
Here's a simplified version of its code:
def bisect_right(a, x):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e <= x, and all e in
    a[i:] have e > x.  So if x already appears in the list, a.insert(i, x) will
    insert just after the rightmost x already there.
    """
    lo = 0
    hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo
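For example, the duplicate detection could be built directly on the stdlib function (a sketch; the function name is chosen here for illustration):

```python
from bisect import bisect_left

def detect_duplicates_bisect(first_list, second_list):
    # second_list must be sorted
    duplicates = []
    for item in first_list:
        i = bisect_left(second_list, item)
        # bisect_left returns the leftmost insertion point, so the item
        # is present exactly when the element at that index equals it
        if i < len(second_list) and second_list[i] == item:
            duplicates.append(item)
    return duplicates
```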
Suggestion 3
Increase the right index by 1:
right = len(second_list)
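For completeness, here is one sketch that keeps the question's narrow-to-one-element structure: rounding the midpoint up guarantees that left = midpoint always makes progress (an alternative to the fixes above, assuming second_list is sorted):

```python
def detect_duplicates_binary(first_list, second_list):
    # second_list must be sorted
    duplicates_list = []
    for item in first_list:
        left, right = 0, len(second_list) - 1
        while left < right:
            # Round the midpoint UP: when right == left + 1 the midpoint
            # is right, so either branch strictly shrinks the range.
            midpoint = (left + right + 1) // 2
            if second_list[midpoint] > item:
                right = midpoint - 1
            else:
                left = midpoint
        if second_list and second_list[left] == item:
            duplicates_list.append(item)
    return duplicates_list
```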
Hope this helps out. Good luck :-)
Related
I am working on a hard but deceptively simple bisection search problem and have been debugging it for hours.
Find Minimum in Rotated Sorted Array II
Hard
Suppose an array sorted in ascending order is rotated at some pivot unknown to you beforehand.
(i.e., [0,1,2,4,5,6,7] might become [4,5,6,7,0,1,2]).
Find the minimum element.
The array may contain duplicates.
Example 1:
Input: [1,3,5]
Output: 1
Example 2:
Input: [2,2,2,0,1]
Output: 0
Note:
This is a follow up problem to Find Minimum in Rotated Sorted Array.
Would allowing duplicates affect the run-time complexity? How and why?
The widely accepted answer takes O(n) time in the worst case:
class SolutionK:
    def findMin(self, nums):
        lo, hi = 0, len(nums) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if nums[mid] > nums[hi]:
                lo = mid + 1
            elif nums[mid] < nums[hi]:
                hi = mid
            else:
                hi -= 1
        return nums[lo]
# why not min(nums) or brute force
I think the problem might be solved by treating the array as circular. Since there are duplicates, we can find the rightmost maximum; then the position after it (max index + 1) holds the minimum.
# the mid
lo = 0
hi = len(nums)
mid = (lo + hi) // 2
mid = mid % len(nums)
and the terminating condition
if nums[mid-1] <= nums[mid] > nums[mid+1]: return mid  # the peak
Unfortunately I cannot work out the conditions for shrinking the search range.
Could you please give some hints?
You can indeed use bisection. If the array consists of only unique numbers and has been rotated, either the leftmost or the rightmost value will be out of order with respect to the middle point. That is, array[0] <= array[len(array) // 2] <= array[-1] will be False. On the other hand, this condition may hold if:
the array is not rotated at all,
or there are duplicates, such as [1, 1, 1, 1, 2] => (rotate left by 1) [1, 1, 1, 2, 1].
So we can separately check the left and right parts of the condition (involving array[0] and array[-1] respectively) and, in case one of them is invalidated, recurse into the corresponding sub-array (the left or right sub-array respectively). If neither is invalidated, we need to check both sides and compare.
The following is an example implementation (it only uses min where there are less than three elements involved, i.e. a simple comparison could be made as well):
def minimum(array):
    if len(array) <= 2:
        return min(array)
    midpoint = len(array) // 2
    if array[0] > array[midpoint]:
        return minimum(array[:midpoint+1])
    elif array[midpoint] > array[-1]:
        return minimum(array[midpoint+1:])
    else:  # Possibly dealing with duplicates.
        return min(minimum(array[:midpoint]),
                   minimum(array[midpoint:]))
from collections import deque
from random import randint, choices

for test in range(1000):
    l = randint(10, 100)
    array = deque(sorted(choices(list(range(l // 2)), k=l)))
    array.rotate(randint(-len(array), len(array)))
    array = list(array)
    assert min(array) == minimum(array)
I am trying to implement a peak-finding algorithm in Python 2.7. The program is intended to find the index of a peak element. A peak element is defined as an element that is not smaller than its immediate neighbors (for the first and last elements, only one side is checked). My code always prints "None" irrespective of the input. Please look at the code:
def peak(L, l, r, n):  # l and r are the left and right bounds of the array
    if l < r:
        m = l + (r - l) // 2
        if L[m] < L[m+1] and m < n:  # n is the length of the array L
            return peak(L, m+1, r, n)
        elif m > 0 and L[m] < L[m-1]:
            return peak(L, l, m-1, n)
        else:
            return m
You are recursively increasing l or decreasing r, but since your first if has no else, it will at some point return None.
Your code is equivalent to
def peak(L, l, r, n):
    if l < r:
        ...  # recursion
    else:
        return None
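A minimal fix that keeps the recursive structure is to return from the l >= r case as well, and to test the bounds before indexing (a sketch, not the only possible repair):

```python
def peak(L, l, r, n):
    if l >= r:
        return l  # one element left: it is a peak of this range
    m = l + (r - l) // 2
    # Bounds checks come first so L[m+1] / L[m-1] are never out of range
    if m + 1 < n and L[m] < L[m+1]:
        return peak(L, m+1, r, n)   # a peak exists to the right
    elif m > 0 and L[m] < L[m-1]:
        return peak(L, l, m-1, n)   # a peak exists to the left
    else:
        return m
```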
An array of length t has all elements initialized to 1. Now we can perform two types of queries on the array:
replace the element at index i with 0; this query is denoted 0 i
find and print an integer denoting the index of the kth 1 in array A on a new line, or -1 if no such index exists; this query is denoted 1 k
For example, suppose t = 4, so the array starts as [1,1,1,1]. After the query 0 2 the array becomes [1,0,1,1], and the query 1 3 then outputs 4.
I have used a brute-force approach, but how can I make the code more efficient?
n, q = 4, 2
arr = [1] * 4
for i in range(q):
    a, b = map(int, input().split())
    if a == 0:
        arr[b-1] = 0
    else:
        flag = True
        count = 0
        target = b
        for i, j in enumerate(arr):
            if j == 1:
                count += 1
                if count == target:
                    print(i+1)
                    flag = False
                    break
        if flag:
            print(-1)
I have also tried first appending all the indexes of 1 to a list and then doing binary search, but popping an element shifts the indices, which is why the code fails.
def binary_search(low, high, b):
    while low <= high:
        mid = (high + low) // 2
        #print(mid)
        if mid + 1 == b:
            print(stack[mid] + 1)
            return
        elif mid + 1 > b:
            high = mid - 1
        else:
            low = mid + 1

n = int(input())
q = int(input())
stack = list(range(n))
for i in range(q):
    a, b = map(int, input().split())
    if a == 0:
        stack.pop(b-1)
        print(stack)
    else:
        if len(stack) < b:
            print(-1)
            continue
        else:
            low = 0
            high = len(stack) - 1
            binary_search(low, high, b)
You could build a binary tree where each node gives you the number of ones that are below and at the left of it. So if n is 7, that tree would initially look like this (the actual list with all ones is shown below it):
        4
       / \
     2     2
    / \   / \
   1   1 1   1
  ---------------
  1 1 1 1 1 1 1 -
Setting the array element at index 4 (zero-based) to 0, would change that tree to:
        4
       / \
     2     1*
    / \   / \
   1   1 0*  1
  ---------------
  1 1 1 1 0*1 1 -
Setting an element to 0 thus takes O(log n) time.
Counting the number of ones can then also be done in the same time complexity by summing up the node values while descending down the tree in the right direction.
Here is Python code you could use. It represents the tree in a list in breadth-first order. I have not gone to great lengths to further optimise the code, but it has the above time complexities:
class Ones:
    def __init__(self, n):  # O(n)
        self.lst = [1] * n
        self.one_count = n
        self.tree = []
        self.size = 1 << (n-1).bit_length()
        at_left = self.size // 2
        width = 1
        while width <= at_left:
            self.tree.extend([at_left // width] * width)
            width *= 2

    def clear_index(self, i):  # O(log n)
        if i >= len(self.lst) or self.lst[i] == 0:
            return
        self.one_count -= 1
        self.lst[i] = 0
        # Update tree
        j = 0
        bit = self.size >> 1
        while bit >= 1:
            go_right = (i & bit) > 0
            if not go_right:
                self.tree[j] -= 1
            j = j*2 + 1 + go_right
            bit >>= 1

    def get_index_of_ith_one(self, num_ones):  # O(log n)
        if num_ones <= 0 or num_ones > self.one_count:
            return -1
        j = 0
        k = 0
        bit = self.size >> 1
        while bit >= 1:
            go_right = num_ones > self.tree[j]
            if go_right:
                k |= bit
                num_ones -= self.tree[j]
            j = j*2 + 1 + go_right
            bit >>= 1
        return k

    def is_consistent(self):  # Only for debugging
        # Check that the list can be derived by calling get_index_of_ith_one for all i
        lst = [0] * len(self.lst)
        for i in range(1, self.one_count + 1):
            lst[self.get_index_of_ith_one(i)] = 1
        return lst == self.lst

# Example use
ones = Ones(12)
print('tree', ones.tree)
ones.clear_index(5)
ones.clear_index(2)
ones.clear_index(1)
ones.clear_index(10)
print('tree', ones.tree)
print('lst', ones.lst)
print('consistent = ', ones.is_consistent())
Be aware that this treats indexes as zero-based, while the method get_index_of_ith_one expects an argument that is at least 1 (but it returns a zero-based index).
It should be easy to adapt to your needs.
Complexity
Creation: O(n)
Clear at index: O(log n)
Get index of one: O(log n)
Space complexity: O(n)
Let's start with some general tricks:
Check if the n-th element is too big for the list before iterating. If you also keep a "counter" that stores the number of zeros, you could even check if nth >= len(the_list) - number_of_zeros (not sure if >= is correct here, it seems like the example uses 1-based indices so I could be off-by-one). That way you save time whenever too big values are used.
Use more efficient functions.
So instead of input you could use sys.stdin.readline (note that it will include the trailing newline).
And, even though it's probably not useful in this context, the built-in bisect module would be better than the binary_search function you created.
You could also use for _ in itertools.repeat(None, q) instead of for i in range(q), that's a bit faster and you don't need that index.
Then you can use some more specialized facts about the problem to improve the code:
You only store zeros and ones, so you can use if j to check for ones and if not j to check for zeros. These will be a bit faster than manual comparisons, especially when you do them in a loop.
Every time you look for the nth 1, you could build a temporary dictionary (or a list) that maps each encountered n to its index, then re-use it for subsequent queries (dict lookup and list random access are O(1), while your search is O(n)). You could even extend it if subsequent queries arrive with no change in between.
However, if a change happens, you either need to discard that dictionary (or list) or update it.
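The caching idea might be sketched as follows (the names and closure structure are illustrative, not from the original code; the cache must be discarded or rebuilt after any 0 query):

```python
def make_kth_one_finder(arr):
    # cache[r-1] holds the 1-based index of the r-th one seen so far
    cache = []
    it = enumerate(arr, 1)

    def kth(k):
        # Extend the scan only as far as this query needs
        while len(cache) < k:
            try:
                index, value = next(it)
            except StopIteration:
                return -1  # fewer than k ones in the whole array
            if value:
                cache.append(index)
        return cache[k - 1]

    return kth
```

For instance, kth(2) on [1, 0, 1, 1] scans up to index 3 once; a later kth(1) is answered from the cache without rescanning.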
A few nitpicks:
The variable names are not very descriptive; you could use for index, item in enumerate(arr): instead of i and j.
You use a list, so arr is a misleading variable name.
You have two i variables.
But don't get me wrong. It's a very good attempt and the fact that you use enumerate instead of a range is great and shows that you already write pythonic code.
Consider something akin to the interval tree:
root node covers the entire array
children nodes cover left and right halves of the parent range respectively
each node holds the number of ones in its range
Both replace and search queries could be completed in logarithmic time.
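One concrete way to sketch that idea (using a Fenwick/binary-indexed tree rather than an explicit interval tree — an implementation choice, not something this answer prescribes):

```python
class BitOnes:
    """Fenwick tree over an array that starts as all ones.

    clear(i) sets position i (1-based) to 0; kth_one(k) returns the
    1-based index of the k-th remaining one, or -1.  Both are O(log n).
    """
    def __init__(self, n):
        self.n = n
        self.count = n
        self.vals = [1] * (n + 1)      # vals[0] unused
        self.tree = [0] * (n + 1)
        for i in range(1, n + 1):      # O(n) build
            self.tree[i] += 1
            j = i + (i & -i)
            if j <= n:
                self.tree[j] += self.tree[i]

    def clear(self, i):
        if self.vals[i]:
            self.vals[i] = 0
            self.count -= 1
            while i <= self.n:         # subtract one along the update path
                self.tree[i] -= 1
                i += i & -i

    def kth_one(self, k):
        if k <= 0 or k > self.count:
            return -1
        pos = 0                        # binary lifting: find the largest
        bit = 1 << self.n.bit_length() # prefix containing fewer than k ones
        while bit:
            nxt = pos + bit
            if nxt <= self.n and self.tree[nxt] < k:
                pos = nxt
                k -= self.tree[nxt]
            bit >>= 1
        return pos + 1

# Example from the question: t = 4, query "0 2" then "1 3"
ones = BitOnes(4)
ones.clear(2)           # array becomes [1, 0, 1, 1]
print(ones.kth_one(3))  # -> 4
```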
Refactored with fewer lines, so more efficient in terms of line count, but the run time is probably still O(n).
n, q = 4, 2
arr = [1] * 4
for i in range(q):
    query, target = map(int, input('query target: ').split())
    if query == 0:
        arr[target-1] = 0
    else:
        count = 0
        items = enumerate(arr, 1)
        try:
            while count < target:
                index, item = next(items)
                count += item
        except StopIteration:
            index = -1
        print(index)
Assumes arr contains ONLY ones and zeroes - you don't have to check whether an item is one before you add it to count, since adding zero has no effect.
No flags to check, just keep calling next on the enumerate object (items) till you reach your target or the end of arr.
For runtime efficiency, using an external library but basically the same process (algorithm):
import numpy as np

for i in range(q):
    query, target = map(int, input('query target: ').split())
    if query == 0:
        arr[target-1] = 0
    else:
        index = -1
        a = np.array(arr).cumsum() == target
        if np.any(a):
            index = np.argmax(a) + 1
        print(index)
I am having problems trying to put a 5x5 array into echelon form. First I made any null (all-zero) row move to the last row of the array (it worked); then I tried to make a row whose pivot has a higher index stay below one with a smaller index, but on the line:
if pivot_index[i] > pivot_index[line_aux] and line_aux < 5 and i < 5:
of the code, the interpreter warns that the list index is out of range, but I do not know why (that's the problem) or how to solve it. The algorithm follows below:
import numpy as np

def search_pivot(L):
    if (np.nonzero(L)[0]).size == 0:
        return -1
    else:
        return np.nonzero(L)[1][0]

def find_pivot_index(mat):
    pivot = []
    for i in range(5):
        pivot.append(search_pivot(np.array(mat[i])))
    return pivot

mat = np.matrix([[0,5,2,7,8],[0,0,4,14,16],[0,0,0,0,0],[2,6,10,16,22],[3,5,8,9,15]]).astype(float)
print("Original array:\n", mat, "\n")

pivot_index = find_pivot_index(mat)
line_aux = 0
for i in range(5):
    line_aux = line_aux + 1
    if pivot_index[i] > pivot_index[line_aux] and line_aux < 5 and i < 5:
        m = mat.tolist()
        (m[i], m[line_aux]) = (m[line_aux], m[i])
        mat = np.matrix(m)
        pivot_index = find_pivot_index(mat)
print(mat, "\n")

line_aux = 0
for i in range(5):
    line_aux = line_aux + 1
    if pivot_index[i] == -1 and line_aux < 5 and i < 5:
        m = mat.tolist()
        (m[i], m[line_aux]) = (m[line_aux], m[i])
        mat = np.matrix(m)
        pivot_index = find_pivot_index(mat)
print(mat)
The and operator in Python is a short-circuit boolean operator.
This means that it will only evaluate the part to the right of and if the left side is True; if the left side is False, that completely determines the outcome of the boolean operation, so the right part is not evaluated. This lets a programmer perform a test on the left-hand side before proceeding to a riskier evaluation that could raise an error.
In your code, you have the test coming after the risky operation: you check whether line_aux is less than 5 after you have already attempted to index pivot_index with a line_aux of 5. (This happens when i = 4, since the first line of your loop increments line_aux.)
Therefore:
To "simply" avoid the 'out of index' error, put tests before risky operations:
if line_aux < 5 and i < 5 and pivot_index[i] > pivot_index[line_aux]:
You might also want to consider whether you in fact intended to increment line_aux at the end of your loop rather than at the start, if that makes more sense for your algorithm; remember that Python lists are 0-indexed.
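The effect of ordering the tests can be seen in isolation (a toy sketch with made-up values, not the question's data):

```python
pivot_index = [3, 1, -1, 0, 2]  # hypothetical pivot indices
line_aux = 5                    # deliberately out of range

# The bounds test runs first, so the risky indexing on the right-hand
# side is never evaluated when line_aux is out of range.
safe = line_aux < 5 and pivot_index[line_aux] > 0
print(safe)  # False, and no IndexError is raised
```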
I have decided to learn Python recently! I want to write a simple merge sort using the following code:
def mergeSort(lst):
    l = len(lst)
    if l <= 0:
        print("empty")
        return None
    elif l == 1:
        return lst
    half = int(l / 2)
    m = lst[half]
    print(half, m)
    left = []
    right = []
    for n in lst:
        if n < m:
            left.append(n)
        else:
            right.append(n)
    left = mergeSort(left)
    right = mergeSort(right)
    return merge(left, right)
Unfortunately this code falls into infinite recursion when it has to deal with a list such as [1, 1, 1]. Can you suggest some way to fix this behavior?
Have you checked out http://www.geekviewpoint.com/? It's probably the best way to learn how to write algorithms in Python the easy way. Check it out. As a bonus it's a very clean website where the only advertisement I have seen recently is about an android brainy puzzle app by axdlab called "Puzz!". The site itself has all sorts of algorithms and good explanations.
Here is their merge sort:
#=======================================================================
# Author: Isai Damier
# Title: Mergesort
# Project: geekviewpoint
# Package: algorithm.sorting
#
# Statement:
# Given a disordered list of integers (or any other items),
# rearrange the integers in natural order.
#
# Sample Input: [8,5,3,1,9,6,0,7,4,2,5]
#
# Sample Output: [0,1,2,3,4,5,5,6,7,8,9]
#
# Time Complexity of Solution:
# Best = Average = Worst = O(nlog(n)).
#
# Approach:
# Merge sort is a divide and conquer algorithm. In the divide and
# conquer paradigm, a problem is broken into pieces where each piece
# still retains all the properties of the larger problem -- except
# its size. To solve the original problem, each piece is solved
# individually; then the pieces are merged back together.
#
# For illustration, imagine needing to sort an array of 200 elements
# using selection sort. Since selection sort takes O(n^2), it would
# take about 40,000 time units to sort the array. Now imagine
# splitting the array into ten equal pieces and sorting each piece
# individually still using selection sort. Now it would take 400
# time units to sort each piece; for a grand total of 10 × 400 = 4,000.
# Once each piece is sorted, merging them back together would take
# about 200 time units; for a grand total of 200+4000 = 4,200.
# Clearly 4,200 is an impressive improvement over 40,000. Now
# imagine greater. Imagine splitting the original array into
# groups of two and then sorting them. In the end, it would take about
# 1,000 time units to sort the array. That's how merge sort works.
#
# NOTE to the Python experts:
# While it might seem more "Pythonic" to take such approach as
#
# mid = len(aList) / 2
# left = mergesort(aList[:mid])
# right = mergesort(aList[mid:])
#
# That approach takes too much memory for creating sublists.
#=======================================================================
def mergesort(aList):
    _mergesort(aList, 0, len(aList) - 1)

def _mergesort(aList, first, last):
    # break problem into smaller structurally identical pieces
    mid = (first + last) // 2
    if first < last:
        _mergesort(aList, first, mid)
        _mergesort(aList, mid + 1, last)

    # merge solved pieces to get solution to original problem
    a, f, l = 0, first, mid + 1
    tmp = [None] * (last - first + 1)
    while f <= mid and l <= last:
        if aList[f] < aList[l]:
            tmp[a] = aList[f]
            f += 1
        else:
            tmp[a] = aList[l]
            l += 1
        a += 1
    if f <= mid:
        tmp[a:] = aList[f:mid + 1]
    if l <= last:
        tmp[a:] = aList[l:last + 1]

    a = 0
    while first <= last:
        aList[first] = tmp[a]
        first += 1
        a += 1
Here is the testbench:
import unittest
from algorithms import sorting

class Test(unittest.TestCase):
    def testMergesort(self):
        A = [8, 5, 3, 1, 9, 6, 0, 7, 4, 2, 5]
        sorting.mergesort(A)
        for i in range(1, len(A)):
            if A[i - 1] > A[i]:
                self.fail("mergesort method fails.")
I believe you're just supposed to divide the list in half at the midpoint - not sort which items go into each half.
So instead of this:
left = []
right = []
for n in lst:
    if n < m:
        left.append(n)
    else:
        right.append(n)
just do this:
left = lst[:half]
right = lst[half:]
The algorithm you implemented is (a flawed) quick sort without removing the so-called "pivot" element, in your case m.
The merge operation you do does not need to do any merging as in merge sort, because a call to mergeSort(left) would already return a sorted left, if you were to handle the pivot correctly.
In merge sort, you don't have a pivot element m, instead, you just halve the list into two parts, as described by James.
As a rule of thumb, recursive calls should always operate on smaller sets of data.
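Putting those points together, one possible complete version (including a merge helper, which the question did not show) might look like:

```python
def merge(left, right):
    # Standard two-pointer merge of two already-sorted lists
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])   # one of these two is already empty
    result.extend(right[j:])
    return result

def mergeSort(lst):
    if len(lst) <= 1:
        return lst            # base case handles [] and single elements
    half = len(lst) // 2
    # Split at the midpoint -- no pivot comparison, so [1, 1, 1]
    # always produces strictly smaller sublists
    return merge(mergeSort(lst[:half]), mergeSort(lst[half:]))
```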