Logic error in my Longest Common Subsequence python - python

I have implemented a solution to the Longest Common Subsequence (LCS) problem using dynamic programming in Python. For those who don't know LCS, here is a link:
https://www.tutorialspoint.com/design_and_analysis_of_algorithms/design_and_analysis_of_algorithms_longest_common_subsequence.htm
My code is not returning the optimal answer. What is wrong with my logic?
import enum


class LCS:
    class Dir(enum.Enum):
        up = 1
        diagonal = 2
        left = 3
        none = 0

    def LCS(self, x, y):
        self.DP = {}
        m = len(x) - 1
        n = len(y) - 1
        self.recursion(x, y, m, n)
        print(self.DP)
        self.printLCS(x, m, n)

    def recursion(self, x, y, i, j):
        if i == 0 or j == 0:
            return [0, self.Dir.none]
        else:
            if (i, j) not in self.DP:
                if x[i] == y[j]:
                    cost = self.recursion(x, y, i - 1, j - 1)[0] + 1
                    dir = self.Dir.diagonal
                else:
                    first = self.recursion(x, y, i - 1, j)
                    second = self.recursion(x, y, i, j - 1)
                    if first[0] >= second[0]:
                        cost = first[0]
                        dir = self.Dir.up
                    else:
                        cost = second[0]
                        dir = self.Dir.left
                self.DP[(i, j)] = [cost, dir]
            return self.DP[(i, j)]

    def printLCS(self, string, i, j):
        if i == 0 or j == 0:
            return
        elif self.DP[(i, j)][1] == self.Dir.diagonal:
            self.printLCS(string, i - 1, j - 1)
            print(string[i], end="")
        elif self.DP[(i, j)][1] == self.Dir.up:
            self.printLCS(string, i - 1, j)
        else:
            self.printLCS(string, i, j - 1)


x = "BDCABA"
y = "ABCBDAB"
sol = LCS()
sol.LCS(x, y)
Expected = "BCBA", Actual = "DAB"

The problem is your base case.
Python strings are 0-based, so the first character of a string s is s[0], not s[1]. Because of this, the recursion must stop when the index moves before the first element, not when it reaches the first element.
Just replace if i == 0 or j == 0: with if i == -1 or j == -1: in both printLCS and recursion. You will then get the output BDAB, which is one of the correct answers.
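The base-case fix described above can be sketched as a small standalone function. This is only an illustration, not the class from the question: the name lcs, the inner helper solve, and the use of functools.lru_cache for memoisation are assumptions made for this example. Ties are broken towards "up", as in the question's code.

```python
from functools import lru_cache


def lcs(x, y):
    """Return one longest common subsequence of x and y (hypothetical helper)."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # i and j are the last indices still under consideration.
        # Index 0 is a real character, so the recursion stops at -1.
        if i == -1 or j == -1:
            return ""
        if x[i] == y[j]:
            return solve(i - 1, j - 1) + x[i]
        up = solve(i - 1, j)
        left = solve(i, j - 1)
        # Same tie-breaking as the question: prefer "up" when lengths are equal.
        return up if len(up) >= len(left) else left

    return solve(len(x) - 1, len(y) - 1)


result = lcs("BDCABA", "ABCBDAB")
print(result, len(result))
```

With i == 0 or j == 0 as the base case, the characters x[0] and y[0] would never be compared, which is why the original version can return a subsequence that is too short.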

Related

Need to optimize my mathematical py code with lists

I'm not sure I am translating the assignment correctly, but the bottom line is that I need to compute the function f(i) = min dist(i, S), where S is a set of k indices and dist(i, S) = sum over j in S of |a[i] - a[j]|, where a is an integer array.
I wrote the following code to accomplish this task:
n, k = (map(int, input().split()))
arr = list(map(int, input().split()))
sorted_arr = arr.copy()
sorted_arr.sort()
dists = []
returned = []
ss = 0
indexed = []
pop1 = None
pop2 = None
for i in arr:
    index = sorted_arr.index(i)
    index += indexed.count(i)
    indexed.append(i)
    dists = []
    if (index == 0):
        ss = sorted_arr[1:k+1]
    elif (index == len(arr) - 1):
        sorted_arr.reverse()
        ss = sorted_arr[1:k+1]
    else:
        if index - k < 0:
            pop1 = 0
        elif index + k > n - 1:
            pop2 = None
        else:
            pop1 = index - k
            pop2 = index + k + 1
        ss = sorted_arr[pop1:index] + sorted_arr[index + 1: pop2]
    for ind in ss:
        dists.append(int(abs(i - ind)))
    dists.sort()
    returned.append(str(sum(dists[:k])))
print(" ".join(returned))
But I need to speed up its execution time significantly.
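One possible way to speed this up is sketched below. It assumes the task is: for every element, sum the absolute differences to its k nearest other elements (that is what the code above appears to compute), and it assumes 0 < k < n. The function name k_nearest_sums is hypothetical. The idea is to sort once, keep prefix sums, and slide a window of k + 1 sorted positions that always contains the current element, so the total cost is dominated by the sort.

```python
from itertools import accumulate


def k_nearest_sums(arr, k):
    """For each element, sum of distances to its k nearest other elements.

    Assumption: 0 < k < len(arr). Hypothetical helper, not the asker's code.
    """
    n = len(arr)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(arr)")
    a = sorted(arr)
    pre = [0] + list(accumulate(a))  # pre[t] = a[0] + ... + a[t-1]
    answer = {}
    lo = 0  # left end of the window a[lo .. lo+k], which must contain position p
    for p, v in enumerate(a):
        lo = max(lo, p - k)
        # Shift right while swapping a[lo] for a[lo+k+1] brings in a closer value.
        while lo < p and lo + k + 1 < n and a[lo + k + 1] - v < v - a[lo]:
            lo += 1
        hi = lo + k
        left = v * (p - lo) - (pre[p] - pre[lo])          # elements below v
        right = (pre[hi + 1] - pre[p + 1]) - v * (hi - p)  # elements above v
        answer[v] = left + right  # equal values share the same result
    return [answer[v] for v in arr]


print(k_nearest_sums([1, 5, 2, 9], 2))
```

The window only ever moves to the right as the sorted position grows, so the loop after sorting is linear; the repeated sorted_arr.index, indexed.count and per-element sort calls of the original are avoided.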

Merge sort for python

Here are my merge and mergeSort functions. merge merges two separate arrays, and mergeSort sorts and merges them with recursion:
def merge(arrL, arrR):
    arrNew = []
    d = len(arrL) + len(arrR)
    i = 0
    j = 0
    for k in range (0, d-1):
        if (arrL[i] < arrR[j]) :
            arrNew.append(arrL[i]) # appends to the end of the array
            i = i + 1
        else:
            arrNew.append(arrR[j])
            j = j + 1
    return arrNew

def mergeSort(arr, m, n):
    if (n - m == 1):
        return arr[m]
    else:
        p = (m + n) // 2
        arrL = mergeSort(arr, m, p)
        arrR = mergeSort(arr, p, n)
        arrNew = merge(arrL, arrR)
        return arrNew
I am getting an error from lines 32, 33 and 13:
d = len(arrL) + len(arrR)
TypeError: object of type 'int' has no len()
What is causing this error? merge is taking two arrays as inputs.
"What is causing this error? merge is taking two arrays as inputs."
Except when it doesn't.
if(n-m == 1):
    return arr[m]
This output of mergeSort is not an array.
My guess is it's this line
if(n-m == 1):
    return arr[m]
which presumably is returning the content arr[m] of the array and not an array itself.
Since your code sorts arrays, when this naked element gets recursed on, it will generate the error you're seeing.
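The failure can be reproduced in isolation with a minimal sketch. The helper merge_lengths is hypothetical and only mirrors the first step of merge (calling len() on both arguments); it is not part of the original code.

```python
def merge_lengths(arrL, arrR):
    # Same first step as merge(): it needs len() of both arguments.
    return len(arrL) + len(arrR)


arr = [3, 1]

# Passing bare elements, as mergeSort does when it returns arr[m]:
try:
    merge_lengths(arr[0], arr[1])
except TypeError as exc:
    error = str(exc)
    print(error)

# Passing one-element slices instead keeps everything a list:
total = merge_lengths(arr[0:1], arr[1:2])
print(total)
```

Slicing (arr[m:n]) always yields a list, even for a single element, which is why returning a slice from the base case avoids the error.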
There are multiple problems in the code:
In mergeSort, you should return a list instead of the bare element arr[m] when the range holds fewer than 2 elements. Return the slice arr[m:n] rather than the whole of arr, because the call is only responsible for the range from m to n. The test if (n - m == 1) also does not allow for empty ranges:
if (n - m < 2):
    return arr[m:n]
In merge, the main loop should run d times, i.e. instead of for k in range (0, d-1): you should write:
for k in range (d):
The test in the merge loop should also check that the index values are still in range. If the second slice is exhausted, the element arrL[i] should be selected:
for k in range (d):
    if i < len(arrL) and (j >= len(arrR) or arrL[i] < arrR[j]):
        arrNew.append(arrL[i])
        i = i + 1
    else:
        arrNew.append(arrR[j])
        j = j + 1
Here is a modified version:
def merge(arrL, arrR):
    arrNew = []
    i = 0
    j = 0
    for k in range(len(arrL) + len(arrR)):
        # take from the left list while it has elements and either the
        # right list is exhausted or the left element is smaller
        if i < len(arrL) and (j >= len(arrR) or arrL[i] < arrR[j]):
            arrNew.append(arrL[i])
            i = i + 1
        else:
            arrNew.append(arrR[j])
            j = j + 1
    return arrNew

def mergeSort(arr, m, n):
    if (n - m < 2):
        # base case: an empty or one-element slice is already sorted
        return arr[m:n]
    else:
        p = (m + n) // 2
        arrL = mergeSort(arr, m, p)
        arrR = mergeSort(arr, p, n)
        return merge(arrL, arrR)

As Far From Land as Possible - DP Solution

I was working on the LeetCode problem "As Far from Land as Possible", which can be found here: https://leetcode.com/problems/as-far-from-land-as-possible/
One solution that is guaranteed to work is to have 4 DP arrays, each of which starts from a different corner of the grid and computes the distance to the nearest land as you head to the opposite corner. In the end, taking the minimum of the elements in all 4 arrays should give the correct solution.
I tried writing a DP solution that does this with a single array, computing each cell by going through the four directions.
My code gives incorrect answers and I can't seem to find where the mistake is.
def maxDistance(self, grid: List[List[int]]) -> int:
    N = len(grid)
    dpfin = [[float('inf') for k in range(N)] for m in range(N)]
    for k in range(N):
        for m in range(N):
            origk = k
            if grid[k][m] == 1:
                dpfin[k][m] = 0
            elif k == 0 and m == 0:
                pass
            elif k == 0:
                dpfin[k][m] = min(dpfin[k][m], dpfin[k][m-1] + 1)
                k = N - 1 - k
                dpfin[k][m] = min(dpfin[k][m], dpfin[k][m-1] + 1)
                m = N - 1 - m
                dpfin[k][m] = min(dpfin[k][m], dpfin[k][m-1] + 1)
                k = origk
                dpfin[k][m] = min(dpfin[k][m], dpfin[k][m-1] + 1)
            elif m == 0:
                dpfin[k][m] = min(dpfin[k][m], dpfin[k-1][m] + 1)
                k = N - 1 - k
                dpfin[k][m] = min(dpfin[k][m], dpfin[k-1][m] + 1)
                m = N - 1 - m
                dpfin[k][m] = min(dpfin[k][m], dpfin[k-1][m] + 1)
                k = origk
                dpfin[k][m] = min(dpfin[k][m], dpfin[k-1][m] + 1)
            else:
                dpfin[k][m] = min( min(dpfin[k-1][m],dpfin[k][m-1])+1,dpfin[k][m])
                k = N - 1 - k
                dpfin[k][m] = min( min(dpfin[k-1][m],dpfin[k][m-1])+1,dpfin[k][m])
                m = N - 1 - m
                dpfin[k][m] = min( min(dpfin[k-1][m],dpfin[k][m-1])+1,dpfin[k][m])
                k = origk
                dpfin[k][m] = min( min(dpfin[k-1][m],dpfin[k][m-1])+1,dpfin[k][m])
    maxi = 0
    for k in range(N):
        for m in range(N):
            maxi = max(maxi,dpfin[k][m])
    if maxi == float('inf') or maxi == 0:
        return -1
    return maxi
I have to admit that it was pretty hard, but I think I have a first "unoptimized" solution. First the code, then the explanation:
class Solution:
    def _rec_fixDP(self, grid, DP, r, c, max_offset, dist):
        if r < 0 or c < 0 or r > max_offset or c > max_offset:
            return
        if grid[r][c] != 1 and (DP[r][c] == -1 or DP[r][c] > dist):
            DP[r][c] = dist
            self._rec_fixDP(grid, DP, r - 1, c, max_offset, dist + 1)
            self._rec_fixDP(grid, DP, r, c - 1, max_offset, dist + 1)
            self._rec_fixDP(grid, DP, r + 1, c, max_offset, dist + 1)
            self._rec_fixDP(grid, DP, r, c + 1, max_offset, dist + 1)

    def _dp_iteration(self, r, c, grid, DP):
        if grid[r][c] == 0:
            dp_left = -1 if r - 1 < 0 else DP[r - 1][c]
            dp_up = -1 if c - 1 < 0 else DP[r][c - 1]
            if dp_left == -1 and dp_up == -1:
                DP[r][c] = -1
            elif dp_left == -1 and dp_up != -1:
                DP[r][c] = dp_up + 1
            elif dp_left != -1 and dp_up == -1:
                DP[r][c] = dp_left + 1
            else:
                DP[r][c] = min(dp_left, dp_up) + 1
        else:
            DP[r][c] = 0
            max_offset = max(r, c)
            self._rec_fixDP(grid, DP, r - 1, c, max_offset, 1)
            self._rec_fixDP(grid, DP, r, c - 1, max_offset, 1)

    def maxDistance(self, grid) -> int:
        n = len(grid)
        DP = [[-1 for i in range(n)] for m in range(n)]
        for i in range(n):
            r = i
            for c in range(i):
                self._dp_iteration(r, c, grid, DP)
            c = i
            for r in range(i + 1):
                self._dp_iteration(r, c, grid, DP)
        cur_max = -1
        for i in DP:
            cur_max = max(cur_max, max(i))
        print(DP)
        return cur_max if cur_max > 0 else -1


sol = Solution()
l = [[0,0,0],[0,0,0],[0,0,1]]
print(sol.maxDistance(l))
I have submitted it to leetcode.com and these are the results:
Runtime: 860 ms, faster than 19.20% of Python3 online submissions for As Far from Land as Possible.
Memory Usage: 14 MB, less than 100.00% of Python3 online submissions for As Far from Land as Possible.
Consider this simple grid
1 2 3
4 5 6
7 8 9
I am looping over the grid in this order: 1 4 2 5 7 8 3 6 9. This way, when I visit grid[r][c] I have already visited grid[r - 1][c] and grid[r][c - 1].
Next, the logic is split into 2 parts: when grid[r][c] = 0 and when grid[r][c] = 1.
For grid[r][c] = 0,
DP[r][c] = min(DP[r - 1][c], DP[r][c - 1]) + 1
The exception is the value -1, which means that no land cell has been found yet.
For grid[r][c] = 1,
DP[r][c] = 0
Although this part is simple (the distance between a land cell and itself is 0), you also need to fix all the previously calculated distances, since they could all be -1 or hold values that are too large. This is done by _rec_fixDP, which keeps recursing as long as the new distance is smaller than the stored one.
The complexity is easy to estimate: you need to loop over every cell at least once (exactly once in the best case), but _rec_fixDP could revisit all the previous cells. So:
Best case: O(n^2)
Average case: O(n^4)
However I suspect it is possible to do it in less than O(n^4).
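One way to stay at O(n^2) is a multi-source breadth-first search, which is a different technique from the DP discussed above: every land cell starts in the queue with distance 0, and each water cell receives its distance the first time the search reaches it. The sketch below is an illustration under the problem's assumptions (square grid, Manhattan distance); the standalone function name max_distance is chosen for this example.

```python
from collections import deque


def max_distance(grid):
    """Largest distance from a water cell to its nearest land cell, or -1."""
    n = len(grid)
    dist = [[-1] * n for _ in range(n)]
    queue = deque()
    # Seed the search with every land cell at distance 0.
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 1:
                dist[r][c] = 0
                queue.append((r, c))
    # No land or no water: the problem asks for -1.
    if not queue or len(queue) == n * n:
        return -1
    best = 0
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and dist[nr][nc] == -1:
                # First visit is the shortest distance to any land cell.
                dist[nr][nc] = dist[r][c] + 1
                best = max(best, dist[nr][nc])
                queue.append((nr, nc))
    return best


print(max_distance([[0, 0, 0], [0, 0, 0], [0, 0, 1]]))  # 4
```

Each cell enters the queue at most once, so no previously computed distance ever has to be repaired.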

Have a list of lists with each element a tuple. Problem specifying a specific tuple element

def distances(a, b):
    """Calculate edit distance from a to b"""
    # declare matrix and set top left box to 0 and none
    cost = [[(0,0) for x in range(len(a)+1)] for y in range(len(b)+1)]
    cost[0][0] = (0, None)
    # fill in first row
    for i in range(1, len(a)+1):
        cost[i][0] = (i, Operation.DELETED)
    #fill in first column
    for j in range(1, len(b)+1):
        cost[0][j] = (j, Operation.INSERTED)
    # fill in rest of the table from left to right, one row at a time
    i = 1
    j = 1
    while i < len(a) + 1:
        while j < len(b) + 1:
            if a[i-1] == b[j-1]:
                cost[i][j] = (cost[i-1][j-1], Operation.SUBSTITUTED)
                j += 1
            else:
                subcost = min(cost[i-1][j][0], cost[i][j-1][0], cost[i-1][j-1][0])
                if subcost == cost[i-1][j][0]:
                    cost[i][j] = (cost[i-1][j][0] + 1, Operation.DELETED)
                    j += 1
                elif subcost == cost[i][j-1][0]:
                    cost[i][j] = (cost[i][j-1][0] + 1, Operation.INSERTED)
                    j += 1
                elif subcost == cost[i-1][j-1][0]:
                    cost[i][j] = (cost[i-1][j-1][0] + 1, Operation.SUBSTITUTED)
                    j += 1
        i += 1
        j = 1
    return cost
Gives me this error message:
TypeError: '<' not supported between instances of 'tuple' and 'int'
and specifies
subcost = min(cost[i-1][j][0], cost[i][j-1][0], cost[i-1][j-1][0])
as the problem line
Each cost[i][j][0] should be specifying the first element of the jth tuple in the ith list of cost, which should be an int, yet it's saying they're tuples and I don't get why.
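A likely cause is the branch for matching characters: cost[i][j] = (cost[i-1][j-1], Operation.SUBSTITUTED) stores the whole tuple cost[i-1][j-1] as the first element, so a later cost[...][0] yields a tuple and min() ends up comparing a tuple with an int. In addition, the matrix is created with len(b)+1 rows of len(a)+1 columns but indexed as cost[i][j] with i ranging over a, which would raise an IndexError whenever the two strings differ in length. The sketch below addresses both points. The Operation enum here is a stand-in, since its definition is not shown in the question, and its member values are assumptions.

```python
from enum import Enum


class Operation(Enum):
    # Stand-in for the Operation type used in the question (values assumed).
    DELETED = 1
    INSERTED = 2
    SUBSTITUTED = 3


def distances(a, b):
    """Calculate edit distance from a to b as a table of (cost, operation)."""
    # One row per prefix of a, one column per prefix of b.
    cost = [[(0, None)] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        cost[i][0] = (i, Operation.DELETED)
    for j in range(1, len(b) + 1):
        cost[0][j] = (j, Operation.INSERTED)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Take the integer cost ([0]), not the whole tuple.
                cost[i][j] = (cost[i - 1][j - 1][0], Operation.SUBSTITUTED)
            else:
                candidates = [
                    (cost[i - 1][j][0] + 1, Operation.DELETED),
                    (cost[i][j - 1][0] + 1, Operation.INSERTED),
                    (cost[i - 1][j - 1][0] + 1, Operation.SUBSTITUTED),
                ]
                # min() keeps the first minimum, preserving the original tie order.
                cost[i][j] = min(candidates, key=lambda entry: entry[0])
    return cost


table = distances("kitten", "sitting")
print(table[-1][-1][0])
```

Nested for loops also remove the need to reset j by hand after each row.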

How to count common letters in order between two words in Python?

I have a string pizzas, and when I compare it to pizza the two are not equal. How can I write a program that counts the common letters (in order) between two words and sets a variable match to True if the match is at least 60%?
For example, pizz and pizzas have 4 out of 6 letters in common, which is a 66% match, so match must be True. But zzip and pizzas do not have any letters in order in common, so match is False.
You can write a function to implement this logic.
zip is used to loop through the 2 strings simultaneously.
def checker(x, y):
    c = 0
    for i, j in zip(x, y):
        if i==j:
            c += 1
        else:
            break
    return c/len(x)

res = checker('pizzas', 'pizz') # 0.6666666666666666
def longestSubstringFinder(string1, string2):
    answer = ""
    len1, len2 = len(string1), len(string2)
    for i in range(len1):
        match = ""
        for j in range(len2):
            if (i + j < len1 and string1[i + j] == string2[j]):
                match += string2[j]
            else:
                if (len(match) > len(answer)): answer = match
                match = ""
    return answer

ss_len = len(longestSubstringFinder("pizz", "pizzas"))
max_len = max(len("pizz"), len("pizzas"))
percent = ss_len/max_len*100
print(percent)
if percent >= 60:
    print("True")
else:
    print("False")
Optimised algorithm using dynamic programming:
def LCSubStr(X, Y, m, n):
    LCSuff = [[0 for k in range(n+1)] for l in range(m+1)]
    result = 0
    for i in range(m + 1):
        for j in range(n + 1):
            if (i == 0 or j == 0):
                LCSuff[i][j] = 0
            elif (X[i-1] == Y[j-1]):
                LCSuff[i][j] = LCSuff[i-1][j-1] + 1
                result = max(result, LCSuff[i][j])
            else:
                LCSuff[i][j] = 0
    return result
This directly returns the length of the longest common substring of X and Y, where m and n are their lengths.
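To connect the substring length to the 60% rule from the question, a small wrapper can be sketched as follows. The names longest_common_substring_len and is_match, the default threshold of 0.6, and the choice to divide by the longer word's length are assumptions for this illustration. The DP keeps only the previous row instead of the full table.

```python
def longest_common_substring_len(a, b):
    """Length of the longest run of characters shared by a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the common suffix ending at the previous characters.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def is_match(a, b, threshold=0.6):
    """True if the longest common substring covers >= threshold of the longer word."""
    longest = max(len(a), len(b))
    if longest == 0:
        return False
    return longest_common_substring_len(a, b) / longest >= threshold


match = is_match("pizz", "pizzas")
print(match)
```

Note that this measures contiguous shared letters anywhere in the two words, so zzip and pizzas still share the run zz; at 2 out of 6 letters this stays below the threshold.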
