I am currently wondering how could we make a efficient implementation of LCS problem.
I found a way to find consecutive match (i.e. ngrams match) with tensor operation by comparing and shifting.
With two sequences x (len: n), y (len: m), the matrix:
e = x.eq(y.unsqueeze(1)) # [n x m]
we have: e[i, j] == 1 <=> x[i] == y[j], a n-gram match will be visible as a diagonal of ones.
Thus we can do the following:
# match_1 = [n x m] of {0, 1}
match_1 = x.eq(y.unsqueeze(1))
# match_2 = [(n-1) x (m-1)] matrix of {0, 1}
match_2 = match_1[:-1, :-1] * match_1[1:, 1:]
# etcetc
The LCS problem is more complicated as we allow gaps. It can be implemented using dynamic programing, in O(n x m), but it is not really protable to Pytorch right?
I tried, it’s super slow.
# considering two "sentences" (LongTensors of word indices)
# _ts, _tr of respective length n and m
table = torch.zeros(n+1, m+1)
_ts, _tr = ts, tr
for i in range(1, n+1):
for j in range(1, m+1):
if _ts[i-1] == _tr[j-1]:
_table[i, j] = _table[i-1, j-1] + 1
else:
_table[i, j] = max(_table[i-1, j], _table[i, j-1])
lcs = _table[n][m]
Any idea to make it more efficient?
Related
Im trying to implement merge sort in Python based on the following pseudo code. I know there are many implementations out there, but I have not been able to find one that followis this pattern with a for loop at the end as opposed to while loop(s). Also, setting the last values in the subarrays to infinity is something I haven't seen in other implementation. NOTE: The following pseudo code has 1 based index i.e. index starts at 1. So I think my biggest issue is getting the indexing right. Right now its just not sorting properly and its really hard to follow with the debugger. My implementation is at the bottom.
Current Output:
Input: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Merge Sort: [0, 0, 0, 3, 0, 5, 5, 5, 8, 0]
def merge_sort(arr, p, r):
if p < r:
q = (p + (r - 1)) // 2
merge_sort(arr, p, q)
merge_sort(arr, q + 1, r)
merge(arr, p, q, r)
def merge(A, p, q, r):
n1 = q - p + 1
n2 = r - q
L = [0] * (n1 + 1)
R = [0] * (n2 + 1)
for i in range(0, n1):
L[i] = A[p + i]
for j in range(0, n2):
R[j] = A[q + 1 + j]
L[n1] = 10000000 #dont know how to do infinity for integers
R[n2] = 10000000 #dont know how to do infinity for integers
i = 0
j = 0
for k in range(p, r):
if L[i] <= R[j]:
A[k] = L[i]
i += 1
else:
A[k] = R[j]
j += 1
return A
First of all you need to make sure if the interval represented by p and r is open or closed at its endpoints. The pseudocode (for loops include last index) establishes that the interval is closed at both endpoints: [p, r].
With last observation in mind you can note that for k in range(p, r): doesn't check last number so the correct line is for k in range(p, r + 1):.
You can represent "infinity" in you problem by using the maximum element of A in the range [p, r] plus one. That will make the job done.
You not need to return the array A because all changes are being done through its reference.
Also, q = (p + (r - 1)) // 2 isn't wrong (because p < r) but correct equation is q = (p + r) // 2 as the interval you want middle integer value of two numbers.
Here is a rewrite of the algorithm with “modern” conventions, which are the following:
Indices are 0-based
The end of a range is not part of that range; in other words, intervals are closed on the left and open on the right.
This is the resulting code:
INF = float('inf')
def merge_sort(A, p=0, r=None):
if r is None:
r = len(A)
if r - p > 1:
q = (p + r) // 2
merge_sort(A, p, q)
merge_sort(A, q, r)
merge(A, p, q, r)
def merge(A, p, q, r):
L = A[p:q]; L.append(INF)
R = A[q:r]; R.append(INF)
i = 0
j = 0
for k in range(p, r):
if L[i] <= R[j]:
A[k] = L[i]
i += 1
else:
A[k] = R[j]
j += 1
A = [433, 17, 585, 699, 942, 483, 235, 736, 629, 609]
merge_sort(A)
print(A)
# → [17, 235, 433, 483, 585, 609, 629, 699, 736, 942]
Notes:
Python has a handy syntax for copying a subrange.
There is no int infinity in Python, but we can use the float one, because ints and floats can always be compared.
There is one difference between this algorithm and the original one, but it is irrelevant. Since the “midpoint” q does not belong to the left range, L is shorter than R when the sum of their lengths is odd. In the original algorithm, q belongs to L, and so L is the longer of the two in this case. This does not change the correctness of the algorithm, since it simply swaps the roles of L and R. If for some reason you need not to have this difference, then you must calculate q like this:
q = (p + r + 1) // 2
In mathematics, we represent all real numbers which are greater than or equal to i and smaller than j by [i, j). Notice the use of [ and ) brackets here. I have used i and j in the same way in my code to represent the region that I am dealing with currently.
ThThe region [i, j) of an array covers all indexes (integer values) of this array which are greater or equal to i and smaller than j. i and j are 0-based indexes. Ignore the first_array and second_array the time being.
Please notice, that i and j define the region of the array that I am dealing with currently.
Examples to understand this better
If your region spans over the whole array, then i should be 0 and j should be the length of array [0, length).
The region [i, i + 1) has only index i in it.
The region [i, i + 2) has index i and i + 1 in it.
def mergeSort(first_array, second_array, i, j):
if j > i + 1:
mid = (i + j + 1) // 2
mergeSort(second_array, first_array, i, mid)
mergeSort(second_array, first_array, mid, j)
merge(first_array, second_array, i, mid, j)
One can see that I have calculated middle point as mid = (i + j + 1) // 2 or one can also use mid = (i + j) // 2 both will work. I will divide the region of the array that I am currently dealing with into 2 smaller regions using this calculated mid value.
In line 4 of the code, MergeSort is called on the region [i, mid) and in line 5, MergeSort is called on the region [mid, j).
You can access the whole code here.
I'm trying to implement an algorithm from Algorithmic Toolbox course on Coursera that takes an arithmetic expression such as 5+8*4-2 and computes its largest possible value. However, I don't really understand the choice of indices in the last part of the shown algorithm; my implementation fails to compute values using the ones initialized in 2 tables (which are used to store maximized and minimized values of subexpressions).
The evalt function just takes the char, turns it into the operand and computes a product of two digits:
def evalt(a, b, op):
if op == '+':
return a + b
#and so on
MinMax computes the minimum and the maximum values of subexpressions
def MinMax(i, j, op, m, M):
mmin = 10000
mmax = -10000
for k in range(i, j-1):
a = evalt(M[i][k], M[k+1][j], op[k])
b = evalt(M[i][k], m[k+1][j], op[k])
c = evalt(m[i][k], M[k+1][j], op[k])
d = evalt(m[i][k], m[k+1][j], op[k])
mmin = min(mmin, a, b, c, d)
mmax = max(mmax, a, b, c, d)
return(mmin, mmax)
And this is the body of the main function
def get_maximum_value(dataset):
op = dataset[1:len(dataset):2]
d = dataset[0:len(dataset)+1:2]
n = len(d)
#iniitializing matrices/tables
m = [[0 for i in range(n)] for j in range(n)] #minimized values
M = [[0 for i in range(n)] for j in range(n)] #maximized values
for i in range(n):
m[i][i] = int(d[i]) #so that the tables will look like
M[i][i] = int(d[i]) #[[i, 0, 0...], [0, i, 0...], [0, 0, i,...]]
for s in range(n): #here's where I get confused
for i in range(n-s):
j = i + s
m[i][j], M[i][j] = MinMax(i,j,op,m,M)
return M[0][n-1]
Sorry to bother, here's what had to be improved:
for s in range(1,n)
in the main function, and
for k in range(i, j):
in MinMax function. Now it works.
The following change should work.
for s in range(1,n):
for i in range(0,n-s):
I am trying to make array assignment in python, but it is very slow, is there any way to accelerate?
simi_matrix_img = np.zeros((len(annot), len(annot)), dtype='float16')
for i in range(len(annot)):
for j in range(i + 1):
score = 0
times = 0
if i != j:
x_idx = [p1 for (p1, q1) in enumerate(annot[i]) if np.abs(q1 - 1) < 1e-5]
y_idx = [p2 for (p2, q2) in enumerate(annot[j]) if np.abs(q2 - 1) < 1e-5]
for idx in itertools.product(x_idx, y_idx):
score += simi_matrix_word[idx]
times += 1
simi_matrix_img[i, j] = score/times
else:
simi_matrix_img[i, j] = 1.0
"annot" is a numpy array. Is there any way to accelerate it?
I think the indent for this line is wrong:
simi_matrix_img[i, j] = score/times
you want to perform that assignment after all the product iterations. But since it's the last assignment that takes, the results will be the same.
Here's a partial reworking of your code
def foo1(annot, simi_matrix_word):
N = annot.shape[0]
simi_matrix_img = np.zeros((N,N))
for i in range(N):
for j in range(i + 1):
if i != j:
x_idx = np.nonzero(annot[i])[0]
y_idx = np.nonzero(annot[j])[0]
idx = np.ix_(x_idx, y_idx)
# print(idx, simi_matrix_word[idx])
score = simi_matrix_word[idx].mean()
simi_matrix_img[i, j] = score
else:
simi_matrix_img[i, j] = 1.0
return simi_matrix_img
For a small test case, it returns the same thing:
annot=np.array([[1,0,1],[0,1,1]])
simi_matrix_word = np.arange(12, dtype=float).reshape(3,4)
[[ 1. 0.]
[ 7. 1.]]
That gets rid of all the inner iterations. Next step would be reduce the outer iterations. For example start with np.eye(N), and just iterate on the lower tri indices:
In [169]: np.eye(2)
Out[169]:
array([[ 1., 0.],
[ 0., 1.]])
In [170]: np.tril_indices(2,-1)
Out[170]: (array([1]), array([0]))
Note that for a 2 row annot, we are only calculating one score, at [1,0].
Replacing nonzero with boolean indexing:
def foo3(annot, simi_matrix_word):
N = annot.shape[0]
A = annot.astype(bool)
simi_matrix_img = np.eye(N,dtype=float)
for i,j in zip(*np.tril_indices(N,-1)):
score = simi_matrix_word[A[i],:][:,A[j]]
simi_matrix_img[i, j] = score.mean()
return simi_matrix_img
or this might speed up the indexing a bit:
def foo4(annot, simi_matrix_word):
N = annot.shape[0]
A = annot.astype(bool)
simi_matrix_img = np.eye(N,dtype=float)
for i in range(1,N):
x = simi_matrix_word[A[i],:]
for j in range(i):
score = x[:,A[j]]
simi_matrix_img[i, j] = score.mean()
return simi_matrix_img
Since the number of nonzero values for each row of annot can differ, the number of terms that are summed for each score also differs. That strongly suggests that further vectorization is impossible.
(1) You could use generators instead of list comprehension where possible. For example:
x_idx = (p1 for (p1, q1) in enumerate(annot[i]) if np.abs(q1 - 1) < 1e-5)
y_idx = (p2 for (p2, q2) in enumerate(annot[j]) if np.abs(q2 - 1) < 1e-5)
With this, you iterate only once over those items (in for idx in itertools.product(x_idx, y_idx)), as opposed to twice (once for constructing the list then again in said for loop).
(2) What Python are you using? If <3, I have a hunch that a significant part of the problem is you're using range(), which can be expensive in connection with really large ranges (as I'm assuming you're using here). In Python 2.7, range() actually constructs lists (not so in Python 3), which can be an expensive operation. Try achieving the same result using a simple while loop. For example, instead of for i in range(len(annot)), do:
i=0
while i < len(annot):
... do stuff with i ...
i += 1
(3) Why call len(annot) so many times? It doesn't seem like you're mutating annot. Although len(annot) is a fast O you could store the length in a var, e.g., annot_len = len(annot), and then just reference that. Wouldn't scrape much off though, I'm afraid.
I have two arrays of points: list1 with list1.shape = [N, 3] and list2 with list2.shape = [M, 3]. Within N, M: the total number of point, (x, y, z) are 3 coordinates in 3D.
Now I want to check if each point of list1 is within a distance r with each point of list2. A nature way to do this is a for loop:
for i in range(N):
for j in range(M):
if (list1[i, 0] - list2[j, 0])**2 + (list1[i, 1] - list2[j, 1])**2 + (list1[i, 2] - list2[j, 2])**2 < r**2:
''' Return 1 if list1[i] is within list2[j] '''
return True
else:
''' Return 0 if list1[i] is not within list2[j] '''
return False
But it's horribly slow. Could I do the more efficient way?
You can use the outer operations to calculate the distance matrix without the for loops:
s = np.subtract.outer
d_matrix = s(list1[:,0], list2[:,0])**2
d_matrix += s(list1[:,1], list2[:,1])**2
d_matrix += s(list1[:,2], list2[:,2])**2
Where each line is the distance of point i about all points. To find out if point i is close to any point using your criterion:
a = np.zeros_like(list1[:,0])
a[np.any(d_matrix < r**2, axis=1)] = 1
I am trying to generate 3-tuples (x,y,z) in python such that no two of x , y or z have the same value. Furthermore , the variables x , y and z can be defined over separate ranges (0,p) , (0,q) and (0,r). I would like to be able to generate n such tuples. One obvious way is to call random.random() for each variable and check every time whether x=y=z . Is there a more efficient way to do this ?
You can write a generator that yields desired elements, for example:
def product_no_repeats(*args):
for p in itertools.product(*args):
if len(set(p)) == len(p):
yield p
and apply reservoir sampling to it:
def reservoir(it, k):
ls = [next(it) for _ in range(k)]
for i, x in enumerate(it, k + 1):
j = random.randint(0, i)
if j < k:
ls[j] = x
return ls
xs = range(0, 3)
ys = range(0, 4)
zs = range(0, 5)
size = 4
print reservoir(product_no_repeats(xs, ys, zs), size)