Recursive binary search in Python - python

I am trying to understand this function with little to no avail. I completely understand what a binary search is but am only new to the concept of recursion but do have a slight grasp on it. I don't really understand what the default values of low and high would be when first calling the function. As of right now I am just including the search space I know the number is in, but what if I don't or I am not sure of the list length? Otherwise, I understand the recursion process going on here as well as the need for low and high being arguments. The function below is provided in the notes by an online course I am taking; however, it wasn't explained in the lecture and contains no docstrings or references about it.
def bSearch(L, e, low, high):
    if high - low < 2:
        return L[low] == e or L[high] == e
    mid = low + int((high-low)/2)
    if L[mid] == e:
        return True
    if L[mid] > e:
        return bSearch(L, e, low, mid-1)
    else:
        return bSearch(L, e, mid+1, high)
L = [1,3,6,15,34,84,78,256]
print bSearch(L, 15, 4, 8)
print bSearch(L, 84, 0, 6)
Output:
False
True

High and low appear to be indices for which part of the list to search.
In the first example, 15 has an index of 3, so specifying a lower index of 4 means the 15 isn't included in the search space. In the second example, 84 has an index of 5, so it is included in the search space spanning indices 0 and 6.
These indices are also inclusive. If the second example were:
print bSearch(L, 84, 0, 5)
the answer would be:
True
If you want to search the entire list, you can simply do:
print bSearch(L, 84, 0, len(L) - 1)
where the - 1 is necessary because the search function is inclusive.

Binary search.
bsearch(list, element to be found, start index, end index).
The start index can be taken as 0 at the start of the function
and the last index can be taken as len(list)-1.
In the question, for bsearch(L, 15, 4, 8),
you are searching only from index 4 to index 8 (the 5th to 9th positions), where the number is not present.
In the second function call you are searching from index 0 to index 6 (the first to the 7th element), where the number is present.
You can call this function as bsearch(L, 15, 0, len(L) - 1), and likewise for any other number.
Hope this helps.

low and high specify the indices of L where the algorithm must search. In the first example, 15 has the index 3. This is not in the interval [4,8] so it will return false. In the second example the index of 84 in L is 5, this is in the interval [0,6] so this will return True.
If the number you are searching for is not in L this method will return False. Why? Because you end up in the base case, if (high-low) < 2. There, both L[low] and L[high] are compared with the number you are searching for. If neither matches, it returns False. This is the definition of the logical or.
False or False = False
False or True = True
True or False = True
True or True = True
If you are not sure about the list length, this will produce an error if the high or low value you provide are not in the range of L. You can add an extra condition so this can not happen, but I think that is out of the scope of that lesson. :)
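One way to avoid passing low and high by hand is a small wrapper that always searches the whole list. This is only a sketch: the name search is made up for illustration, the empty-list guard is the kind of "extra condition" mentioned above, and the list must be sorted for the result to be meaningful.

```python
def bSearch(L, e, low, high):
    # Same logic as the course function; low and high are inclusive indices.
    if high - low < 2:
        return L[low] == e or L[high] == e
    mid = low + (high - low) // 2
    if L[mid] == e:
        return True
    if L[mid] > e:
        return bSearch(L, e, low, mid - 1)
    return bSearch(L, e, mid + 1, high)


def search(L, e):
    # Hypothetical wrapper (not part of the course notes):
    # searches the whole list and guards against the empty list.
    if not L:
        return False
    return bSearch(L, e, 0, len(L) - 1)


sorted_list = [1, 3, 6, 15, 34, 78, 84, 256]
print(search(sorted_list, 15))
print(search(sorted_list, 5))
print(search([], 5))
```

With the wrapper the caller never needs to know the list length, and out-of-range indices cannot be passed in by mistake.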

Related

How can I get a sum from some elements of a list? [duplicate]

I have a list of numbers. I also have a certain sum. The sum is made from a few numbers from my list (I may/may not know how many numbers it's made from). Is there a fast algorithm to get a list of possible numbers? Written in Python would be great, but pseudo-code's good too. (I can't yet read anything other than Python :P )
Example
list = [1,2,3,10]
sum = 12
result = [2,10]
NOTE: I do know of Algorithm to find which numbers from a list of size n sum to another number (but I cannot read C# and I'm unable to check if it works for my needs. I'm on Linux and I tried using Mono but I get errors and I can't figure out how to work C# :(
AND I do know of algorithm to sum up a list of numbers for all combinations (but it seems to be fairly inefficient. I don't need all combinations.)
This is the subset sum problem, a special case of the 0-1 Knapsack Problem in which you are trying to find a set with an exact sum. The solution depends on the constraints; in the general case this problem is NP-Complete.
However, if the maximum search sum (let's call it S) is not too high, then you can solve the problem using dynamic programming. I will explain it using a recursive function and memoization, which is easier to understand than a bottom-up approach.
Let's code a function f(v, i, S) that returns the number of subsets of v[i:] that sum exactly to S. To solve it recursively, first we have to analyze the base case (i.e.: v[i:] is empty):
S == 0: The only subset of [] has sum 0, so it is a valid subset. Because of this, the function should return 1.
S != 0: As the only subset of [] has sum 0, there is no valid subset. Because of this, the function should return 0.
Then, let's analyze the recursive case (i.e.: v[i:] is not empty). There are two choices: include the number v[i] in the current subset, or not include it. If we include v[i], then we are looking for subsets that have sum S - v[i]; otherwise, we are still looking for subsets with sum S. The function f might be implemented in the following way:
def f(v, i, S):
    if i >= len(v): return 1 if S == 0 else 0
    count = f(v, i + 1, S)
    count += f(v, i + 1, S - v[i])
    return count
v = [1, 2, 3, 10]
sum = 12
print(f(v, 0, sum))
By checking f(v, 0, S) > 0, you can know if there is a solution to your problem. However, this code is too slow: each recursive call spawns two new calls, which leads to an O(2^n) algorithm. Now, we can apply memoization to make it run in time O(n*S), which is faster if S is not too big:
def f(v, i, S, memo):
    if i >= len(v): return 1 if S == 0 else 0
    if (i, S) not in memo: # <-- Check if value has not been calculated.
        count = f(v, i + 1, S, memo)
        count += f(v, i + 1, S - v[i], memo)
        memo[(i, S)] = count # <-- Memoize calculated result.
    return memo[(i, S)] # <-- Return memoized value.
v = [1, 2, 3, 10]
sum = 12
memo = dict()
print(f(v, 0, sum, memo))
Now, it is possible to code a function g that returns one subset that sums S. To do this, it is enough to add elements only if there is at least one solution including them:
def f(v, i, S, memo):
    # ... same as before ...
def g(v, S, memo):
    subset = []
    for i, x in enumerate(v):
        # Check if there is still a solution if we include v[i]
        if f(v, i + 1, S - x, memo) > 0:
            subset.append(x)
            S -= x
    return subset
v = [1, 2, 3, 10]
sum = 12
memo = dict()
if f(v, 0, sum, memo) == 0: print("There are no valid subsets.")
else: print(g(v, sum, memo))
Disclaimer: This solution says there are two subsets of [10, 10] that sum to 10. This is because it treats the first ten as different from the second ten. The algorithm can be fixed to treat both tens as equal (and thus answer one), but that is a bit more complicated.
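For convenience, the pieces of this answer can be assembled into one runnable script. This is a sketch under the same assumptions as above (S not too large); find_subset is a hypothetical helper name added only to tie f and g together, and the two calls in f are merged into one expression.

```python
def f(v, i, S, memo):
    """Number of subsets of v[i:] that sum exactly to S (memoized)."""
    if i >= len(v):
        return 1 if S == 0 else 0
    if (i, S) not in memo:
        # Either skip v[i] or include it in the subset.
        memo[(i, S)] = f(v, i + 1, S, memo) + f(v, i + 1, S - v[i], memo)
    return memo[(i, S)]


def g(v, S, memo):
    """Build one subset that sums to S, assuming at least one exists."""
    subset = []
    for i, x in enumerate(v):
        # Keep x only if the rest of the list can still reach S - x.
        if f(v, i + 1, S - x, memo) > 0:
            subset.append(x)
            S -= x
    return subset


def find_subset(v, S):
    # Hypothetical helper: returns one subset, or None if there is none.
    memo = {}
    if f(v, 0, S, memo) == 0:
        return None
    return g(v, S, memo)


print(find_subset([1, 2, 3, 10], 12))  # [2, 10]
print(find_subset([1, 2, 3, 10], 9))   # None
```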
I know I'm answering 10 years after you asked this, but I really needed to know how to do this and the way jbernadas did it was too hard for me, so I googled it for an hour and found a Python library, itertools, that gets the job done!
I hope this helps future newbie programmers.
You just have to import the library and use the combinations() function. It is that simple: it returns all the subsets of a given length, ignoring order. I mean:
For the set [1, 2, 3, 4] and a subset length of 3, it will not return [1, 2, 3], [1, 3, 2] and [2, 3, 1] separately; it will return just [1, 2, 3].
As you want ALL the subsets of a set, you can iterate over every length:
import itertools
sequence = [1, 2, 3, 4]
for i in range(len(sequence) + 1):
    for j in itertools.combinations(sequence, i):
        print(j)
The output will be
()
(1,)
(2,)
(3,)
(4,)
(1, 2)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
(3, 4)
(1, 2, 3)
(1, 2, 4)
(1, 3, 4)
(2, 3, 4)
(1, 2, 3, 4)
Hope this help!
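To get from "all subsets" to the subsets with a given sum, one option is to filter the combinations. A minimal sketch, assuming brute force is acceptable (it still visits all 2^n subsets, so it only suits short lists); subsets_with_sum is a name made up for this example.

```python
import itertools


def subsets_with_sum(numbers, target):
    # Hypothetical helper: yields every combination whose sum equals target.
    for size in range(len(numbers) + 1):
        for combo in itertools.combinations(numbers, size):
            if sum(combo) == target:
                yield combo


print(list(subsets_with_sum([1, 2, 3, 10], 12)))  # [(2, 10)]
```

Because it is a generator, next(subsets_with_sum(...), None) stops at the first match instead of listing every one.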
So, the logic is to sort the numbers in descending order. Suppose the list of numbers is l and the sum to be formed is s.
for i in b:
    if(a(round(n-i,2),b[b.index(i)+1:])):
        r.append(i)
        return True
return False
Then we go through this loop, where a number is selected from l in order; let's say it is i.
There are 2 possible cases: either i is part of the sum or it is not.
So, we assume that i is part of the solution, and then the problem reduces to l being l[l.index(i)+1:] and s being s-i. So, if our function is a(l,s), then we call a(l[l.index(i)+1:], s-i). And if i is not a part of s, then we have to form s from the list l[l.index(i)+1:].
So it is similar in both cases; the only change is that if i is part of s, then s=s-i, and otherwise s stays as it is.
Now, to reduce the problem: numbers in l that are greater than s are removed to reduce the complexity. If l becomes empty, the numbers selected so far are not part of a solution and we return False.
if(len(b)==0):
    return False
while(b[0]>n):
    b.remove(b[0])
    if(len(b)==0):
        return False
And in case l has only 1 element left, either it equals s and we return True, or it does not and we return False, and the loop goes on to the next number.
if(b[0]==n):
    r.append(b[0])
    return True
if(len(b)==1):
    return False
Note that in the loop I have used b, but b is just our list. I have also rounded wherever possible, so that we do not get a wrong answer due to floating point calculations in Python.
r=[]
list_of_numbers=[61.12,13.11,100.12,12.32,200,60.00,145.34,14.22,100.21,14.77,214.35,200.32,65.43,0.49,132.13,143.21,156.34,11.32,12.34,15.67,17.89,21.23,14.21,12,122,134]
list_of_numbers=sorted(list_of_numbers)
list_of_numbers.reverse()
sum_to_be_formed=401.54
def a(n,b):
    global r
    if(len(b)==0):
        return False
    while(b[0]>n):
        b.remove(b[0])
        if(len(b)==0):
            return False
    if(b[0]==n):
        r.append(b[0])
        return True
    if(len(b)==1):
        return False
    for i in b:
        if(a(round(n-i,2),b[b.index(i)+1:])):
            r.append(i)
            return True
    return False
if(a(sum_to_be_formed,list_of_numbers)):
    print(r)
This solution works fast, faster than the one explained above.
However, it works for positive numbers only.
It also only works well if a solution exists; otherwise it takes too much time to get out of the loops.
An example run looks like this. Let's say
l=[1,6,7,8,10]
and s=22 i.e. s=1+6+7+8
so it goes through like this
1.) [10, 8, 7, 6, 1] 22
i.e. 10 is selected to be part of 22, so s=22-10=12 and 10 is removed from l.
2.) [8, 7, 6, 1] 12
i.e. 8 is selected to be part of 12, so s=12-8=4 and 8 is removed from l.
3.) [7, 6, 1] 4
Now 7 and 6 are removed and 1!=4, so it returns False for this execution, where 8 was selected.
4.) [6, 1] 5
i.e. 7 is selected to be part of 12, so s=12-7=5 and 7 is removed from l.
Now 6 is removed and 1!=5, so it returns False for this execution, where 7 was selected.
5.) [1] 6
i.e. 6 is selected to be part of 12, so s=12-6=6 and 6 is removed from l.
Now 1!=6, so it returns False for this execution, where 6 was selected.
6.) [] 11
i.e. 1 is selected to be part of 12, so s=12-1=11 and 1 is removed from l.
Now l is empty, so all the cases in which 10 was a part of s are False. So 10 is not a part of s, and we now start with 8, and the same cases follow.
7.) [7, 6, 1] 14
8.) [6, 1] 7
9.) [1] 1
just to give a comparison which i ran on my computer which is not so good.
using
l=[61.12,13.11,100.12,12.32,200,60.00,145.34,14.22,100.21,14.77,214.35,145.21,123.56,11.90,200.32,65.43,0.49,132.13,143.21,156.34,11.32,12.34,15.67,17.89,21.23,14.21,12,122,134]
and
s=2000
my loop ran 1018 times and took 31 ms,
and the previous code's loop ran 3415587 times and took somewhere near 16 seconds.
However, in case a solution does not exist, my code ran for more than a few minutes, so I stopped it, while the previous code ran in around 17 ms. The previous code also works with negative numbers.
So I think some improvements can be made.
#!/usr/bin/python2
ylist = [1, 2, 3, 4, 5, 6, 7, 9, 2, 5, 3, -1]
print ylist
target = int(raw_input("enter the target number"))
for i in xrange(len(ylist)):
    sno = target-ylist[i]
    for j in xrange(i+1, len(ylist)):
        if ylist[j] == sno:
            print ylist[i], ylist[j]
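This Python code does what you asked for pairs: it prints every pair of numbers in the list whose sum is equal to the target variable. Pairs are formed by position, so duplicate values in the list produce repeated pairs.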
if target number is 8, it will print:
1 7
2 6
3 5
3 5
5 3
6 2
9 -1
5 3
I have found an answer which has run-time complexity O(n) and space complexity about O(2n), where n is the length of the list.
The answer satisfies the following constraints:
List can contain duplicates, e.g. [1,1,1,2,3] and you want to find pairs sum to 2
List can contain both positive and negative integers
The code is as below, and followed by the explanation:
def countPairs(k, a):
    # List a, sum is k
    temp = dict()
    count = 0
    for iter1 in a:
        temp[iter1] = 0
        temp[k-iter1] = 0
    for iter2 in a:
        temp[iter2] += 1
    for iter3 in list(temp.keys()):
        if iter3 == k / 2 and temp[iter3] > 1:
            count += temp[iter3] * (temp[k-iter3] - 1) / 2
        elif iter3 == k / 2 and temp[iter3] <= 1:
            continue
        else:
            count += temp[iter3] * temp[k-iter3] / 2
    return int(count)
1. Create an empty dictionary, iterate through the list and put all the possible keys in the dict with initial value 0.
Note that the key (k-iter1) must also be added. E.g. if the list contains 1 but does not contain 4, and the sum is 5, then when we look at 1 we want to find how many 4s we have; if 4 is not in the dict, the lookup raises an error.
2. Iterate through the list again, count how many times each integer occurs and store the results in the dict.
3. Iterate through the dict, this time to find how many pairs we have. We need to consider 3 conditions:
3.1 The key is exactly half of the sum and this key occurs more than once in the list, e.g. list is [1,1,1], sum is 2. The code handles this special condition by counting n*(n-1)/2 pairs for n occurrences.
3.2 The key is exactly half of the sum and this key occurs only once in the list; we skip this condition.
3.3 For the other cases, where the key is not half of the sum, just multiply its value by the value of the other key, where these two keys sum to the given value. E.g. if the sum is 6, we multiply temp[1] and temp[5], temp[2] and temp[4], etc. Each such pair of keys is visited twice, which is why the product is halved. (I didn't list cases where numbers are negative, but the idea is the same.)
The most complex step is step 3, which involves searching the dictionary, but searching a dictionary is usually fast, with nearly constant complexity. (The worst case is O(n), but that should not happen for integer keys.) Thus, assuming the search has constant complexity, the total complexity is O(n), as we only iterate over the list a fixed number of times, separately.
Advice for a better solution is welcomed :)
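One possible simplification, offered as a sketch rather than a definitive improvement: collections.Counter does the counting from steps 1 and 2, and visiting each key pair only once (x < y) avoids the division by two. The name count_pairs is made up here, and integer inputs are assumed.

```python
from collections import Counter


def count_pairs(k, a):
    # Hypothetical variant of countPairs: counts index pairs (i < j)
    # with a[i] + a[j] == k.
    counts = Counter(a)
    total = 0
    for x, cx in counts.items():
        y = k - x
        if x < y:
            # Distinct values: every x pairs with every y.
            total += cx * counts.get(y, 0)
        elif x == y:
            # Same value twice: choose 2 of the cx occurrences.
            total += cx * (cx - 1) // 2
    return total


print(count_pairs(2, [1, 1, 1, 2, 3]))  # 3
```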

Ceiling of the element in sorted array

Hi I am doing DSA problems and found a problem called as ceiling of the element in sorted array. In this problem there is a sorted array and if the target element is present in the sorted array return the target. If the target element is not found in the sorted array we need to return the smallest element which is greater than target. I have written the code and also done some test cases but need to check if everything works correctly. This problem is not there on leetcode where I could run it with many different cases. Need suggestion/feedback if the problem is solved in the correct way and if it would give correct results in all cases
class Solution:
    # My approach
    def smallestNumberGreaterThanTarget(self, nums, target):
        start = 0
        end = len(nums)-1
        if target > nums[end]:
            return -1
        while start <= end:
            mid = start + (end-start)//2
            if nums[mid] == target:
                return nums[mid]
            elif nums[mid] < target:
                if nums[mid+1] >= target:
                    return nums[mid+1]
                start = mid + 1
            else:
                end = mid-1
        return nums[start]
IMO, the problem can be solved in a simpler way, with only one test inside the main loop. The figure below shows a partition of the real line, in subsets associated to the values in the array.
First, we notice that for all values above the largest, there is no corresponding element, and we will handle this case separately.
Now we have exactly N subsets left, and we can find the right one by a dichotomic search among these subsets.
if target > nums[len(nums)-1]:
    return None
s, e = 0, len(nums)
while e > s:
    m = e + ((s - e) >> 1)
    if target > nums[m]:
        s = m+1
    else:
        e = m
return s
We can formally prove the algorithm using the invariant nums[s-1] < target <= nums[e], with the fictional convention nums[-1] = -∞. In the end, we have the bracketing nums[s-1] < target <= nums[s].
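As a sketch, the fragment above can be wrapped in a function that returns the element itself (the fragment returns the index s). The name ceiling and the empty-list guard are assumptions added for this example, and the midpoint is written with // instead of the shift.

```python
def ceiling(nums, target):
    # Hypothetical wrapper: smallest element >= target, or None if there is none.
    if not nums or target > nums[-1]:
        return None
    s, e = 0, len(nums)
    while s < e:
        m = s + (e - s) // 2
        if target > nums[m]:
            s = m + 1   # everything up to m is too small
        else:
            e = m       # nums[m] is a candidate, keep it in range
    return nums[s]


print(ceiling([1, 3, 5, 7], 4))  # 5
print(ceiling([1, 3, 5, 7], 8))  # None
```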
The code errors out with an index out-of-range error for the empty list (though this may not be necessary because you haven't specified the problem constraints).
A simple if guard at the top of the function can fix this:
if not nums:
    return -1
Otherwise, it seems fine to me. But if you're still not sure whether or not your algorithm works, you can always do random testing (e.g. create a linear search version of the algorithm and then randomly generate inputs to both algorithms, and then see if there's any difference).
Here's a one-liner that you can test against:
input_list = [0, 1, 2, 3, 4]
target = 0
print(next((num for num in input_list if num >= target), -1))
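The random testing idea can be sketched like this. The binary version below is the asker's loop rewritten as a plain function with the empty-list guard added; the function names, the value ranges and the number of trials are arbitrary choices for illustration.

```python
import random


def ceiling_binary(nums, target):
    # The asker's algorithm, plus the empty-list guard.
    if not nums:
        return -1
    start, end = 0, len(nums) - 1
    if target > nums[end]:
        return -1
    while start <= end:
        mid = start + (end - start) // 2
        if nums[mid] == target:
            return nums[mid]
        elif nums[mid] < target:
            if nums[mid + 1] >= target:
                return nums[mid + 1]
            start = mid + 1
        else:
            end = mid - 1
    return nums[start]


def ceiling_linear(nums, target):
    # Obviously-correct reference: first element that is >= target.
    return next((num for num in nums if num >= target), -1)


def run_random_check(trials=1000, seed=0):
    # Returns the number of inputs on which the two versions disagree.
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(trials):
        nums = sorted(rng.randint(-20, 20) for _ in range(rng.randint(0, 10)))
        target = rng.randint(-25, 25)
        if ceiling_binary(nums, target) != ceiling_linear(nums, target):
            mismatches += 1
    return mismatches


print(run_random_check())
```

One caveat: -1 doubles as the "not found" marker here, as in the question, so a real ceiling of -1 cannot be told apart from a miss.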

Python: Binary Search - "Find the first occurrence"

I'm having a bit of trouble with this one. I have included what I have below. When I submit it, it keeps saying "Program timed out" for some reason. I am not sure what to do next. It works to a certain degree, i.e. some tests pass, but the last test just doesn't work. What do you suggest?
I have included a screenshot of the question, as well as what I have so far.
Here is the note (pseudocode) from class. I just need to modify it to return the first occurrence of the target in the ordered_list. If the target does not exist in the list, it must return None.
Thank you in advance!!
The Question:
You are to write the code of a Python function
binsearch_first(ordered_list, target)
that, given a nonempty ordered list of items and a target item, all of the same type, returns the index of the first occurrence of the target in the list, if the target is in the list, and None otherwise.
For example, the call binsearch_first([1, 3, 3, 7, 9], 3) should return 1 since the first 3 is at index 1. Similarly, the call binsearch_first([1, 3, 3, 7, 9], 9) should return 4, and the call binsearch_first([1, 3, 3, 7, 9], 5) should return None.
You may not assume anything about the type of the items, other than that they are orderable. For example, items could be strings and the call binsearch_first(["Alice", "Bob", "Chloe", "Chloe", "Dave"], "Chloe") should return 2.
Your program will be evaluated for efficiency and style. For full credit, it may only make a single test for equality (it may only have a single “==” comparison which, additionally, may not be within any loop). That is, the only equality test happens at the end of execution, just before returning.
Restrictions: Recursion is not allowed for this problem. Furthermore, you are not allowed to use any operations other than
+, −, //, ×, <, and (once) ==
Of course, all builtins and library functions related to search are also disallowed: you have to do the coding yourself.
def binsearch_first(ordered_list, target):
    left = 0
    right = len(ordered_list) - 1
    count = 0
    while left <= right:
        mid = (left + right) // 2
        count = count + 1
        if ordered_list[mid] == target:
            while mid > 0 and ordered_list[mid - 1] == target:
                mid = mid - 1
            return mid
        elif target < ordered_list[mid]:
            right = mid - 1
        else:
            left = mid + 1
    return None
Find the first occurrence
The only operator that works with string and integer is <.
We have to make use of the fact that it is an ordered list - arranged in increasing order.
def binsearch(orderedlist,target):
    candidate = 0
    for i in range(len(orderedlist)):
        if orderedlist[i] < target:
            candidate = candidate
        else:
            # it is an ordered list, so the first element that is not smaller
            # than the target is the only place the first occurrence can be
            candidate = i
            break # can you use break?
    if orderedlist[candidate] == target:
        return candidate
    else:
        return None
I am not a CS student hence cannot comment on the effectiveness of the program, but you can achieve your goal by using a simple for loop
def binsearch_first(ordered_list, target):
    i=0
    for ele in ordered_list:
        if ele == target:
            return i
            break
        else:
            i+=1
    return None
Result of this is:
>>> binsearch_first([1, 3, 3, 7, 9], 3)
1
>>> binsearch_first(["Alice", "Bob", "Chloe", "Chloe", "Dave"], "Chloe")
2
Regards
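For comparison, here is a sketch of how the single "==" restriction and the timeout might both be handled. It is not taken from the course notes. The inner while loop in the question walks left one step at a time, which is linear when the list has many duplicates; narrowing the range with only "<" keeps the search logarithmic and leaves one equality test at the end. It assumes a nonempty list, as the assignment states.

```python
def binsearch_first(ordered_list, target):
    left = 0
    right = len(ordered_list) - 1
    # Invariant: the first occurrence, if any, lies in ordered_list[left..right].
    while left < right:
        mid = (left + right) // 2
        if ordered_list[mid] < target:
            left = mid + 1   # everything up to mid is too small
        else:
            right = mid      # mid may be the first occurrence, keep it
    # Single equality test, outside the loop.
    if ordered_list[left] == target:
        return left
    return None


print(binsearch_first([1, 3, 3, 7, 9], 3))  # 1
print(binsearch_first(["Alice", "Bob", "Chloe", "Chloe", "Dave"], "Chloe"))  # 2
```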

understanding while loop in python

I am quite new to Python and trying to learn algorithms. Why is it logically wrong to use low < hi when looking over the list? The correct condition is low <= hi; what edge case is it guarding against?
def binary_search(input_array, value):
    """Your code goes here."""
    #O(log(n))
    low = 0
    hi = len(input_array) - 1
    while low <= hi: #why cant it be low < hi
        mid = (low + hi)//2
        if input_array[mid] == value:
            return mid
        elif input_array[mid] < value:
            print(low, hi)
            low = mid + 1
        else:
            hi = mid - 1
    return -1
test_list = [1,3,9,11,15,19,29]
test_val1 = 25
test_val2 = 15
print(binary_search(test_list, test_val1))
print(binary_search(test_list, test_val2))
Consider that you have only one element, [1], and you are searching for 1.
< : returns -1, since you would just skip the loop
<= : returns the correct value
When your target is, for example in your case, 15:
After the 1st iteration: low == 4, hi == 6
After the 2nd iteration: low == 4, hi == 4
If you use low < hi, you won't enter the loop a third time, since 4 < 4 is False. Your program will think it can't find the value even though the value is at index 4.
You could, if you had written the following (and, because hi is then one past the end, the else branch must become hi = mid instead of hi = mid - 1):
hi = len(input_array)
while low < hi: # now this works.
    ...
In fact, this is how I generally write these loops. Take advantage of the fact that len() will always be 1 more than the last index, and run the loop only while the index is < that value.
If you subtract 1 from len(input_array) so that the max value for hi is the last index in the array, then in order to run the loop on that last element you need the = part of low <= hi.
Generally it's (mentally) easier to instead set hi = len(input_array), which is 1 past the last index, and then run the loop only while low < hi.
(Less typing, and less mental gymnastics).
In this case, once low==hi we've gone past the last index, and it would then be out of bounds.
The "edge case" you refer to is that you want to make sure to run the loop on all elements (indexes).
So one way or another, you need to look at that last index/element of the array.
Basically there are two common ways to code this, but you cannot mix them up. Make sure your end condition is with respect to your initial condition.
What you set hi to (either the length of the array, or the last index of the array) determines whether you are going to use low < hi or low <= hi.
We use <= instead of just < because the desired value may sit at the one index left when low == hi; the equal sign is needed so the loop still checks that element.
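The two conventions can be put side by side. This is a sketch; the function names are made up for the comparison. Note that the half-open version changes three things together: the initial hi, the loop condition, and the update hi = mid.

```python
def search_closed(arr, value):
    # hi is the last valid index: the range [low, hi] is inclusive.
    low, hi = 0, len(arr) - 1
    while low <= hi:
        mid = (low + hi) // 2
        if arr[mid] == value:
            return mid
        elif arr[mid] < value:
            low = mid + 1
        else:
            hi = mid - 1
    return -1


def search_half_open(arr, value):
    # hi is one past the last index: the range [low, hi) excludes hi.
    low, hi = 0, len(arr)
    while low < hi:
        mid = (low + hi) // 2
        if arr[mid] == value:
            return mid
        elif arr[mid] < value:
            low = mid + 1
        else:
            hi = mid  # mid is already excluded because hi is exclusive
    return -1


test_list = [1, 3, 9, 11, 15, 19, 29]
print(search_closed(test_list, 15), search_half_open(test_list, 15))  # 4 4
print(search_closed([1], 1), search_half_open([1], 1))                # 0 0
```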

How to trace this recursive program

I was wondering if someone would be able to tell me how this program arrives at the correct answer. I have tried to trace it and am not sure where to go because it has an or, so I am confused, how it is supposed to be traced. Thanks for any clarification.
Write a function partition that takes a list of numbers xs, a starting position i, and a desired sum s and returns True or False depending on whether it is possible to find a subsequence of the list elements starting from position i whose sum is exactly s. Note that it is always possible to produce a subsequence of elements whose sum is exactly 0, namely the empty sequence of elements.
def partition(xs,i,s):
    print i,s
    if i == len(xs):
        return s == 0
    else:
        return partition(xs,i+1,s-xs[i]) or partition(xs,i+1,s)
The rcviz module is a nice tool to help to visualise recursive functions:
1. The edges are numbered by the order in which they were traversed by the execution. 2. The edges are colored from black to grey to indicate order of traversal: black edges first, grey edges last.
If you follow the calls, which are numbered 1-11, you can see exactly what is happening: i starts at 0, then goes to 1, 2, 3 and finally 4. The last value on the left is partition([1,2,3,4],4,-2), so it returns False for s == 0.
Next we go back to where i is 2, then again 3, 4, and end up with partition([1,2,3,4],4,1), so s == 0 is again False.
Next we go from step 6, ending with partition([1,2,3,4],4,5), where yet again s == 0 is False.
Finally, on the right we go from partition([1,2,3,4],2,7) all the way down to partition([1,2,3,4],4,0), where s == 0 is True and the function returns True.
If you take the first four calls, you can see how the flow goes and how s is changed.
partition([1,2,3,4],1,7) # -> xs[i] = 1 s - 1 = 7
partition([1,2,3,4],2,5) # -> xs[i] = 2 s - 2 = 5
partition([1,2,3,4],3,2) # -> xs[i] = 3 s - 3 = 2
partition([1,2,3,4],4,-2) # -> xs[i] = 4 s - 4 = -2
s == 0 # -> False
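If rcviz is not available, a rough text version of the same trace can be printed by indenting each call by its recursion depth. This is a sketch: the depth and calls parameters are additions for tracing only, and the starting call partition([1,2,3,4], 0, 8) is inferred from the calls listed above rather than stated in the question.

```python
def partition(xs, i, s, depth=0, calls=None):
    # depth and calls are tracing aids, not part of the original function.
    if calls is not None:
        calls.append((i, s))
    print("  " * depth + "partition(xs, %d, %d)" % (i, s))
    if i == len(xs):
        return s == 0
    # "or" short-circuits: the second call only runs if the first returns False.
    return (partition(xs, i + 1, s - xs[i], depth + 1, calls)
            or partition(xs, i + 1, s, depth + 1, calls))


calls = []
result = partition([1, 2, 3, 4], 0, 8, calls=calls)
print(result, len(calls))
```

Each extra level of indentation is one deeper recursive call, and a line that appears after a deeper line at the same indentation is the "or" branch being tried after the first branch failed.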
Maybe this version, which is logically equivalent, makes it a bit clearer. The key is that return a or b is equivalent to if a: return a else: return b.
def partition(xs,i,s):
    print i,s
    if i == len(xs):
        # Base case: If the goal is to sum to 0, we succeed.
        return s == 0
    else:
        # First, try including element i in our sum:
        first_try = partition(xs,i+1,s-xs[i])
        if first_try:
            return True
        else:
            # If first try failed, try not including element i
            second_try = partition(xs,i+1,s)
            return second_try
This is an explanation of how or works in this context:
return partition(xs,i+1,s-xs[i]) or partition(xs,i+1,s)
will return partition(xs,i+1,s-xs[i]) if this expression evaluates to True. If partition(xs,i+1,s-xs[i]) evaluates to False, partition(xs,i+1,s) will be returned (regardless of whether it evaluates to True or False).
Note you can test this with the following set of simple examples:
In [1]: 1 or 2 # When both are True, the first is returned.
Out[1]: 1
In [2]: 0 or 2 # When the first is False, the second is returned.
Out[2]: 2
In [4]: 0 or False # When both are False, the second is returned.
Out[4]: False
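The other half of the story is that or does not even evaluate its right-hand side when the left-hand side is truthy, which is why the recursion stops as soon as one branch succeeds. A small sketch with a made-up helper that records which operands were evaluated:

```python
def check(label, result, log):
    # Hypothetical helper: remembers that it was called, then returns result.
    log.append(label)
    return result


log = []
value = check("first", True, log) or check("second", True, log)
print(value, log)   # True ['first']

log = []
value = check("first", False, log) or check("second", False, log)
print(value, log)   # False ['first', 'second']
```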
I think it becomes a lot easier to understand if it's written with better names and no indexing:
def can_add_to(numbers, sumwanted):
    if not numbers:
        return sumwanted == 0
    first, *rest = numbers
    return can_add_to(rest, sumwanted-first) or can_add_to(rest, sumwanted)
print(can_add_to([1, 4, 9, 25], 13))
This is the same as yours, only more readable. Explanation then:
If there are no numbers, then the answer is "yes" if and only if the wanted sum is zero, and we can say that right away.
Otherwise take the first number (and the rest). You can either use it for the sum or not.
Using it: can_add_to(rest, sumwanted-first) tells you whether the remaining sum (after subtracting first) can be made from the remaining numbers.
Not using it: can_add_to(rest, sumwanted) tells you whether the whole sum can be made from the remaining numbers alone.
The overall answer is "yes" if and only if you can make the sum with or without first. That's why you take the two sub-answers and or them together.
