Non-tail recursion within a for loop - python

Given an array of numbers, find the length of the longest increasing subsequence in the array. The subsequence does not necessarily have to be contiguous.
For example, given the array [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15], the longest increasing subsequence has length 6: it is 0, 2, 6, 9, 11, 15.
One of the solutions to the above problem uses non-tail recursion within a for loop, and I am having trouble making sense of it. I don't understand when the code after the recursive call in the for loop is executed, and I can't visualize the entire execution process of the whole solution.
def longest_increasing_subsequence(arr):
    if not arr:
        return 0
    if len(arr) == 1:
        return 1
    max_ending_here = 0
    for i in range(len(arr)):
        ending_at_i = longest_increasing_subsequence(arr[:i])
        if arr[-1] > arr[i - 1] and ending_at_i + 1 > max_ending_here:
            max_ending_here = ending_at_i + 1
    return max_ending_here
The description of the solution is as follows:
Assume that we already have a function that gives us the length of the longest increasing subsequence. Then we’ll try to feed some part of our input array back to it and try to extend the result. Our base cases are: the empty list, returning 0, and an array with one element, returning 1.
Then,
For every index i up until the second to last element, calculate longest_increasing_subsequence up to there.
We can only extend the result with the last element if our last element is greater than arr[i] (since otherwise, it’s not increasing).
Keep track of the largest result.
Source: https://www.dailycodingproblem.com/blog/longest-increasing-subsequence/
**EDITS**:
To clarify what I mean by not understanding when the code after the recursive call in the for loop is executed, here is my understanding:
Some code calls lis([0, 8, 4, 12, 2]).
arr = [0, 8, 4, 12, 2] doesn't meet either of the two base cases.
The for loop makes the first call when i = 0 in the line ending_at_i = lis([]). This is the first base case, so it returns 0. I can't understand why control doesn't return to the for loop so that ending_at_i is set to 0, and the if condition is executed (because it surely isn't checked else [][-1] would throw an error), after which we can move on to the for loop making the second call when i = 1, third call when i = 2 which would branch into two calls, and so on.

Here's how this function works. First, it handles the degenerate cases where the list length is 0 or 1.
It then looks for the solution when the list length is >= 2. There are two possibilities for the longest sequence: (1) It may contain the last number in the list, or (2) It may not contain the last number in the list.
For case (1), if the last number in the list is in the longest sequence, then the number before it in the longest sequence must be one of the earlier numbers. Suppose the number before it in the sequence is at position x. Then the longest sequence is the longest sequence taken from the numbers in the list up to and including x, plus the last number in the list. So it recurses on all of the possible positions of x, which are 0 through the list length minus 2. It iterates i over range(len(arr)), which is 0 through len(arr)-1. But it then uses i as the upper bound in the slice, so the last element in the slice corresponds to indices -1 through len(arr)-2. In the case of -1, this is an empty slice, which handles the case where all values in the list before the last are >= the last element.
This handles case (1). For case (2), we just need to find the largest sequence from the sublist that excludes the last element. However, this check is missing from the posted code, which is why the wrong answer is given for a list like [1, 2, 3, 0]:
>>> longest_increasing_subsequence([1, 2, 3, 0])
0
>>>
Obviously the correct answer in this case is 3, not 0. This is fairly easy to fix, but somehow was left out of the posted version.
Also, as others have pointed out, creating a new slice each time it recurses is unnecessary and inefficient. All that's needed is to pass the length of the sublist to achieve the same result.
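Here is a minimal sketch of that last suggestion, passing indexes instead of slices. It is not a patch of the posted function: it assumes the subproblem is redefined as "the longest increasing subsequence that ends exactly at index i", which is one common way to make the recursion give the right answer for inputs like [1, 2, 3, 0]. The names lis_length and ending_at are hypothetical.

```python
def lis_length(arr):
    """Length of the longest strictly increasing subsequence of arr."""

    def ending_at(i):
        # Longest increasing subsequence that ends exactly at arr[i].
        best = 1  # arr[i] on its own
        for j in range(i):
            if arr[j] < arr[i]:
                # arr[i] can extend any increasing subsequence ending at arr[j].
                best = max(best, ending_at(j) + 1)
        return best

    # The overall answer may end at any index, so take the maximum.
    return max((ending_at(i) for i in range(len(arr))), default=0)


print(lis_length([1, 2, 3, 0]))
```

No list is copied here, but the running time is still exponential because the same ending_at(j) values are recomputed many times.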

Here is a (hopefully good enough) explanation:
ending_at_i = the length of the LIS when you clip arr at the i-th index (that is, considering elements arr[0], arr[1], ..., arr[i-1]).
if arr[-1] > arr[i - 1] and ending_at_i + 1 > max_ending_here
if arr[-1] > arr[i - 1] = if the last element of arr is greater than the last element of the part of arr corresponding to ending_at_i
if ending_at_i + 1 > max_ending_here = if appending the last element of arr to the LIS found during computing ending_at_i is larger than the current best LIS
The recursive step is then:
Let an oracle tell you the length of the LIS in arr[:i] (= arr[0], arr[1], ..., arr[i-1])
realize that, if the last element of arr, that is, arr[-1], is larger than the last element of arr[:i], then whatever the LIS inside arr[:i] was, if you take it and append arr[-1], it will still be an LIS, except that it will be one element larger
Check whether arr[-1] is actually larger than arr[i-1], (= arr[:i][-1])
Check whether appending arr[-1] to the LIS of arr[:i] creates the new optimal solution
Repeat 1., 2., 3. for i in range(len(arr)).
The result will be the knowledge of the length of the LIS inside arr.
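To see when the code after the recursive call actually runs, it can help to record every call and return. The sketch below is an instrumented copy of the posted function; the name traced_lis and the depth and log parameters are additions for illustration only, and the two base cases are merged into one check.

```python
def traced_lis(arr, depth=0, log=None):
    """Posted algorithm, plus a log of every call and return."""
    if log is None:
        log = []
    indent = "  " * depth
    log.append(f"{indent}call {arr}")
    if len(arr) <= 1:
        result = len(arr)  # base cases: [] -> 0, one element -> 1
    else:
        result = 0
        for i in range(len(arr)):
            ending_at_i, _ = traced_lis(arr[:i], depth + 1, log)
            # Control comes back here as soon as the call above returns,
            # so this check runs once per loop iteration.
            if arr[-1] > arr[i - 1] and ending_at_i + 1 > result:
                result = ending_at_i + 1
    log.append(f"{indent}return {result}")
    return result, log


result, log = traced_lis([0, 8])
print("\n".join(log))
```

For [0, 8] the log reads: call [0, 8], then call [] and return 0, then call [0] and return 1, then return 2. Each nested return is followed immediately by the if check in the caller. Note that arr[-1] in that check is the last element of the caller's arr, not of the empty slice, which is why no error is raised when i = 0.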
All that being said, since the recursive substep of this algorithm runs in O(n), there are very few worse feasible solutions to the problem.
You tagged this question with dynamic programming; however, this is precisely the anti-example of it. Dynamic programming lets you reuse the solutions to subproblems, which is precisely what this algorithm doesn't do, hence wasting time. Check out a DP solution instead.
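For comparison, a minimal sketch of a textbook O(n^2) dynamic programming version. It is not taken from the linked blog post; the name lis_length_dp and the table name dp are assumptions for illustration.

```python
def lis_length_dp(arr):
    """O(n^2) dynamic programming length of the longest increasing subsequence."""
    # dp[i] = length of the longest increasing subsequence ending exactly at arr[i]
    dp = [1] * len(arr)
    for i in range(len(arr)):
        for j in range(i):
            if arr[j] < arr[i]:
                # Reuse the stored answer for index j instead of recomputing it.
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp, default=0)


print(lis_length_dp([0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]))
```

Each dp[j] is computed once and then looked up, which is the subproblem reuse the recursive version lacks.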

Related

How do I derive this part of the algorithm

I'm a beginner at programming Python and came across this program.
This algorithm is used to reverse a list:
mylist = [1,2,3,4]
reverse = mylist[:]
for i in range(len(reverse)//2):
    reverse[i], reverse[len(reverse) -i -1] = reverse[len(reverse) -i -1], reverse[i]
This algorithm is based on the logic that the swapping process should only run up to the middle of the list (index len/2, rounded down for lists with an odd length), because if the swapping process continued to the last element, every pair would be swapped twice and the list would remain the same as it was in the beginning.
I understood what the below part does, but how do I derive it, please explain the logic:
reverse[len(reverse) -i -1]
len(reverse) returns the number of elements the list has. To actually use it as an index you need to subtract 1 from it as indices start from 0. Next, we also subtract i from it as we are moving i positions away from both ends of the list and swapping them.
So if i=1, reverse[i] points to 2 while reverse[len(reverse)-i-1] points to 3.
You can just print the contents of these variables:
lst = [1, 2, 3, 4, 5, 6]
for i in range(len(lst)//2):
    print(i, len(lst) - i - 1)
This will print
0 5
1 4
2 3
Therefore,
for the index of the first element (i = 0), len(lst) - i - 1 will return the index of the last element (index 5).
for the index of the second element (i = 1), len(lst) - i - 1 will return the index of the second to last element (index 4).
And so on.
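The same idea can be wrapped in a small function to check it against lists of even and odd length. This is only a sketch; the names reversed_copy and mirror are made up for illustration.

```python
def reversed_copy(items):
    """Return a reversed copy of items by swapping mirrored positions."""
    result = items[:]
    n = len(result)
    for i in range(n // 2):
        mirror = n - i - 1  # index that is i steps in from the right end
        result[i], result[mirror] = result[mirror], result[i]
    return result


print(reversed_copy([1, 2, 3, 4]))
print(reversed_copy([1, 2, 3, 4, 5]))
```

With an odd length the middle element has no partner and is never touched, which is why stopping at n // 2 is enough.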

longest increasing subsequence problem - n log n solution that returns the actual subsequence - explanation/clarification needed

I've tried to implement the n log n solution to the longest increasing subsequence problem (where you need to find a longest subsequence where each element is larger than the previous one of a given sequence), one which will find the actual subsequence that is longest (not just its length). I've worked off of this video - https://www.youtube.com/watch?v=S9oUiVYEq7E - but sadly I don't think the algorithm shown on the video is correct - it seems to at least work for the exact input that is shown on the video, but doesn't work for others, such as [1, 8, 6, 4, 9, 8, 3, 5, 2, 7, 1, 9, 5, 7].
from bisect import bisect_left, bisect_right
from math import floor, ceil
sequence = [3, 4, -1, 5, 8, 2, 3, 12, 7, 9, 10]
indexes = [0]
helper = [-1] * len(sequence)
for i in range(1, len(sequence)):
    if len(indexes) == 0 or sequence[i] > sequence[indexes[-1]]:
        indexes.append(i)
        helper[i] = indexes[-2]
    else:
        ceiltable = bisect_right([sequence[x] for x in indexes], sequence[i])
        indexes[ceiltable] = i
        if ceiltable > 0:
            helper[i] = indexes[ceiltable - 1]
solution = [sequence[x] for x in indexes]
print(f"longest increasing subsequence is {solution}, and has a length of {len(solution)}")
My questions are: can anyone confirm or refute whether the algorithm shown in that video is actually correct, and what might be wrong with my implementation of it? Also, can anyone provide a simple explanation/pseudocode/mockup of the n log n solution to this problem? I tried searching, of course, but I don't think there is anything that really explains how this solution works, or specifically how an implementation of it would work. Again, just to note, I also have to return the actual subsequence, not just the length.
The video you refer to explains the algorithm correctly.
There are two issues in your implementation:
You should use bisect_left instead of bisect_right, as otherwise you will allow solutions that are in fact non-decreasing sequences (with potential duplicate values) instead of strictly increasing sequences. And bisect_right may also result in an index that is equal to the length of the list, resulting in an invalid index access error. (Side note: If you really want to use bisect_right and find non-decreasing sequences, then make the preceding if condition >=)
The code does not translate the gathered data correctly into a solution. You really need to use the helper list to trace back the solution. Here is the code you could use:
solution = []
i = indexes[-1]  # start with the index of the last value of the longest sequence
while i >= 0:
    solution.append(sequence[i])
    i = helper[i]  # Look up what the optimal predecessor is of that index
solution.reverse()  # Reverse the solution since we populated it in reverse order
Other remark
The way you perform binary search is not efficient because you iterate the whole list with a list comprehension. That defeats the efficiency offered by binary search. You should have the values ready for binary search, so keep them in a separate list that you maintain throughout the algorithm, and don't do such a list comprehension any more at the time of the binary search.
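A minimal sketch that combines the points above: bisect_left for strictly increasing sequences, a list of tail values maintained alongside the indexes so that no list comprehension is needed per search, and the trace-back through the predecessor list. The function name lis_with_traceback and the variable names tails, tail_indexes and prev are illustrative; they come from neither the video nor the question.

```python
from bisect import bisect_left


def lis_with_traceback(sequence):
    """Return one longest strictly increasing subsequence of sequence."""
    tails = []         # tails[k]: smallest tail value of an increasing run of length k + 1
    tail_indexes = []  # tail_indexes[k]: position in sequence of tails[k]
    prev = [-1] * len(sequence)  # prev[i]: index of the predecessor of sequence[i]
    for i, value in enumerate(sequence):
        k = bisect_left(tails, value)  # binary search on the maintained value list
        if k == len(tails):
            tails.append(value)
            tail_indexes.append(i)
        else:
            tails[k] = value
            tail_indexes[k] = i
        if k > 0:
            prev[i] = tail_indexes[k - 1]
    # Walk the predecessor links back from the tail of the longest run.
    result = []
    i = tail_indexes[-1] if tail_indexes else -1
    while i >= 0:
        result.append(sequence[i])
        i = prev[i]
    result.reverse()
    return result


print(lis_with_traceback([3, 4, -1, 5, 8, 2, 3, 12, 7, 9, 10]))
```

For the sequence from the question this yields [-1, 2, 3, 7, 9, 10]. Other subsequences of the same length exist, for example 3, 4, 5, 8, 9, 10; which one is returned depends on the tie-breaking of the search.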

Fast way to check consecutive subsequences for total

I have a list (up to 10,000 long) of numbers 0, 1, or 2.
I need to see how many consecutive subsequences have a total which is NOT 1. My current method is to for each list do:
cons = 0
for i in range(seqlen+1):
    for j in range(i+1, seqlen+1):
        if sum(twos[i:j]) != 1:
            cons += 1
So an example input would be:
[0, 1, 2, 0]
and the output would be
cons = 8
as the 8 working subsequences are:
[0] [2] [0] [1,2] [2, 0] [0, 1, 2] [1, 2, 0] [0, 1, 2, 0]
The issue is that simply going through all these subsequences (the i in range, j in range) takes almost more time than is allowed, and when the if statement is added, the code takes far too long to run on the server. (To be clear, this is only a small part of a larger problem, I'm not just asking for the solution to an entire problem). Anyway, is there any other way to check faster? I can't think of anything that wouldn't result in more operations needing to happen every time.
I think I see the problem: your terminology is slightly off. A run of consecutive elements is usually called a substring or contiguous subarray; a subsequence in general may skip elements.
Do not sum every candidate. Instead, identify every candidate whose sum is 1, and then subtract that total from the computed quantity of all sub-sequences (simple algebra).
All of the 1-sum candidates are of the regular expression form 0*10*: a 1 surrounded by any quantity of 0s on either or both sides.
Identify all such maximal-length strings. For instance, in
210002020001002011
you will pick out 1000, 000100, 01, and 1. For each string compute the quantity of substrings that contain the 1 (a simple equation on the lengths of the 0s on each side). Add up those quantities. Subtract from the total for the entire input. There's your answer.
Use the sliding window technique to solve this type of problem. Keep two variables, first and last, to track the bounds of the window. You start with sum equal to the first element. If the sum is larger than the required value, subtract the element at 'first' from sum and increment first by 1. If the sum is smaller than required, add the next element after 'last' to sum and increment last by 1. Every time sum is equal to the required value, increment some counter.
As for NOT, count number of sub-sequence having '1' sum and then subtract from total number of sub-sequence possible, i.e. n * (n + 1) / 2
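A minimal sketch of the counting idea above, assuming the input is a list containing only 0, 1 and 2. For each 1, a run with sum 1 may start at any of the zeros immediately to its left (or at the 1 itself) and end at any of the zeros immediately to its right (or at the 1 itself). The function name is hypothetical.

```python
def count_runs_not_summing_to_one(nums):
    """Count contiguous runs of nums whose sum is not 1 (nums holds 0, 1, 2)."""
    n = len(nums)
    total = n * (n + 1) // 2  # number of non-empty contiguous runs
    sum_one = 0
    for p, value in enumerate(nums):
        if value != 1:
            continue
        left = 0  # zeros immediately to the left of this 1
        while p - left - 1 >= 0 and nums[p - left - 1] == 0:
            left += 1
        right = 0  # zeros immediately to the right of this 1
        while p + right + 1 < n and nums[p + right + 1] == 0:
            right += 1
        # Choose a start among left + 1 positions and an end among right + 1.
        sum_one += (left + 1) * (right + 1)
    return total - sum_one


print(count_runs_not_summing_to_one([0, 1, 2, 0]))
```

For [0, 1, 2, 0] there are 10 runs in total and 2 of them sum to 1, which gives the 8 from the question. Each block of zeros is scanned at most twice, so the work stays linear in the length of the list.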

Index out of range confusion

So I am new to programming, and I am having trouble with index out of range errors. Quick example:
I have a list, lst = (5,7,8,9,10).
I want to remove every even number, and every number to the right of an even number.
I would approach this problem by getting the index of every even number, 'i' , and removing lst[i] and lst [i+1]. That will not work when the last number is even because there is no lst [i+1] after the last element in the list.
I have run into this issue on several basic problems I have been working on. My approach to solving this is probably wrong, so I would like to know:
Can I solve the problem this way, and if so how, whether it is efficient or not?
What would be the most efficient way to solve this problem?
Welcome to the club! Programming is a lot of fun and something you can always improve upon with incremental progress. I'm going to try to be exhaustive with my answer.
With lists (also known as arrays) remember that a list and its indexes are zero-based. What this means is that an array's indexes start at the number 0 (not number 1 like you would do in normal counting).
arr = [5, 7, 8, 9, 10]
# If you want to access the first element of the array
# then you would use the 0 index. If you want the Second
# element you use index 1.
print(arr[0]) # prints 5 or the 1st element
print(arr[1]) # prints 7 or the 2nd element
I would not use a standard looping technique like for or while in this case, because you are removing elements as you are going through the array. If you delete an item as you are looping, you are changing the length of the array.
Instead, you could create a new array from looping and only adding or appending odd values to this new array.
arr = [5, 7, 8, 9, 10]
new_arr = []
for idx, val in enumerate(arr):
    if idx % 2 == 1:
        new_arr.append(val)
print(new_arr)  # prints [7, 9]: a new array holding the elements at odd indexes
In addition, remember that when you use [idx+1] while indexing through a loop, it makes sense to stop the loop an element early to avoid an out of index range error.
Do this (no error)
for idx in range(len(arr)-1):
    # pseudocode
    print(arr[idx] + arr[idx+1])
instead of this (out of index error). The reason is that on the last element, adding 1 to the last index points at a position that does not exist, so an error is raised:
for idx in range(len(arr)):
    # pseudocode
    print(arr[idx] + arr[idx+1])
arr = [5, 7, 8, 9, 10]
# if you try to access arr[5]
# you will get an error because the index
# and element do not exist
# the last element of arr is arr[4] or arr[-1]
arr[5] # yields an out of index error
There are many Pythonic (almost like a colloquial phrase specific to Python) ways to accomplish your goal that are more efficient; some are shown below.
You can use slicing with a step and the del (delete) statement to remove elements at even indexes
>>> arr = [5, 7, 8, 9, 10]
>>> del arr[::2] # delete the elements at even indexes; to delete those at odd indexes instead, use del arr[1::2]
>>> arr
[7, 9]
Or use a list comprehension with a condition to create a new list while looping; this one keeps the elements at even indexes:
new_arr = [elem for idx, elem in enumerate(arr) if idx % 2 == 0]
The % operator gives the remainder of a division. So if idx is 10, then 10 % 2 == 0 is true because 2 divides into 10 five times and the remainder is 0. Therefore, the index is even. If you were checking for odd, the condition would be:
idx % 2 == 1
You can find further explanation of these Python methods from this great Stack Overflow post here
One issue you may run into is your list indexes shifting on you during removal. One way around this is to sort the indexes to be removed in descending order and remove them first.
Here is an example of how you could accomplish what you are looking for:
myList = [5, 7, 8, 9, 10]
# use list comprehension to get indexes of even numbers into a list.
# num % 2 uses the modulus operator to find numbers divisible by 2
# with a remainder of 0.
even_number_indexes = [idx for idx, num in enumerate(myList) if num % 2 == 0]
# even_number_indexes: [2, 4]
# sort our new list descending
even_number_indexes.reverse()
# even_number_indexes: [4, 2]
# iterate over even_number_indexes and delete index and index + 1
# from myList by specifying a range [index:index + 2]
for index in even_number_indexes:
    del myList[index:index + 2]
print(myList)
output: [5, 7]
You can check if i+1 is greater than (Edit: or equal to) the length of the list, and if it is, not execute the code.
You can also handle this in a try/except block.
As to the efficiency of this method of solving, seems fine to me. One gotcha in this approach is that people try to iterate over the list while modifying it, which can lead to unknown errors. If you're using the remove() function, you probably want to do it with a copy of the list.
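One way to sidestep both the shifting indexes and the i+1 bounds check is to build a new list in a single pass and never touch lst[i+1] at all. This is only a sketch under the reading that a number is dropped when it is even or when the number just before it is even; the function name and the skip_next flag are made up for illustration.

```python
def drop_evens_and_followers(nums):
    """Return nums without even numbers and without the item right after each even."""
    result = []
    skip_next = False  # True when the previous item was even
    for num in nums:
        if num % 2 == 0:
            skip_next = True  # drop this even number and flag its right neighbour
            continue
        if skip_next:
            skip_next = False  # this odd number sits right after an even one
            continue
        result.append(num)
    return result


print(drop_evens_and_followers([5, 7, 8, 9, 10]))
```

For the example list this gives [5, 7], and an even number in the last position needs no special case because nothing after it is ever accessed.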

List index out of range error while performing a binary search

I attempted to create a function that takes an ordered list of numbers and a given number, and decides whether or not the given number is inside the list. I am trying to use a binary search to accomplish this task.
I have two steps:
First, I am making list1 smaller by only taking the numbers in list1 that are smaller than the given number, and then appending those numbers into a new list, called newlist.
Next, in the while loop, I am basically taking all the numbers that are less than the number in the middle of the newlist and removing them, repeating that process multiple times until there is only one number in newlist. From there, I would compare that number to the given number. My code is shown below.
list1 = [1, 3, 5, 6, 8, 14, 17, 29, 31]
number = 7
def func(list1, number):
    newlist = []
    for x in list1:
        if x < number:
            newlist.append(x)
        else:
            continue
    while len(newlist) > 1:
        for x in range(0, len(newlist) - 1):
            if newlist[x] < newlist[round(len(newlist) / 2)]:
                newlist.remove(newlist[x])
            else:
                continue
    if newlist[0] == number:
        return True
    else:
        return False
print(func(list1, number))
I am receiving an error at line 36 (if newlist[x] < newlist[round(len(newlist) / 2)]:), that the list index is out of range. I think that the problem is that as newlist is getting smaller and smaller, the x value set by range(0, len(newlist) - 1) is staying the same?? If that is the case, I am unsure of how to remedy that. Much thanks in advance.
The issue is this bit right here:
for x in range(0, len(newlist) - 1):
    if newlist[x] < newlist[round(len(newlist) / 2)]:
        newlist.remove(newlist[x])
First, you're iterating over the list
[0, 1, 2, ..., len(newlist) - 2]
This list is generated when you start the loop, meaning that if len(newlist) is 7 at the beginning, the list will always go up to 5, regardless of whether things are removed from newlist, which you later do. This is what causes your error, since at some point you've removed enough elements that your list is now, say, three elements large, but python is trying to access the fifth element because the list it's iterating over isn't newlist, it's [0, 1, 2, 3, 4, 5].
To fix this, you could (for example) replace the for loop with this:
x = 0
while x < len(newlist) - 1:
    if newlist[x] < newlist[round(len(newlist) / 2)]:
        newlist.pop(x)  # simple way of saying "remove item at index x"
    else:
        x += 1  # only move on when nothing was removed at this index
This is essentially the way of doing a C or Java-style for loop in python, and will avoid this type of problem.
I also understand that you have an issue with the logic in your code, which was pointed out in one of the comments above and gets more to the heart of your underlying issue, but this is at least an explanation of why this error occurred in the first place, so maybe it's helpful to you in the future.
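For reference, a minimal sketch of a conventional binary search that avoids removing items from the list altogether. It is not a repair of the code in the question; it is the standard low/high technique, and the function name contains is hypothetical. It assumes the input list is sorted in ascending order.

```python
def contains(sorted_list, target):
    """Return True if target is in sorted_list (which must be sorted ascending)."""
    low, high = 0, len(sorted_list) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_list[mid] == target:
            return True
        if sorted_list[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return False


list1 = [1, 3, 5, 6, 8, 14, 17, 29, 31]
print(contains(list1, 7))
print(contains(list1, 8))
```

Because only the two bounds move and the list itself is never modified, every index that is accessed stays within range.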
