I created a simple implementation of Dynamic Time Warping in Python, but I feel like it is a bit of a hack. I implemented the recurrence relation (or, at least, I believe I did!), but because in my case this involves a numpy array, I had to wrap it in a class to get memoisation to work (numpy arrays are mutable, and therefore unhashable).
Wiki link to DTW: Dynamic Time Warping
Here is the code:
import numpy as np
from functools import lru_cache

class DynamicTimeWarp(object):

    def __init__(self, seq1, seq2):
        self.warp_matrix = self.time_warp_matrix(seq1, seq2)

    def time_warp_matrix(self, seq1, seq2):
        # pairwise distances between elements of the two sequences
        output = np.zeros((len(seq1), len(seq2)), dtype=np.float64)
        for i in range(len(seq1)):
            for j in range(len(seq2)):
                output[i][j] = np.sqrt((seq1[i] - seq2[j]) ** 2)
        return output

    @lru_cache(maxsize=100)
    def warp_path(self, i=None, j=None):
        # default to the bottom-right corner of the matrix
        if (i is None) and (j is None):
            i, j = self.warp_matrix.shape
            i -= 1
            j -= 1
        distance = self.warp_matrix[i, j]
        path = ((i, j),)
        if i == j == 0:
            return distance, path
        potential = []
        if i - 1 >= 0:
            potential.append(self.warp_path(i - 1, j))
        if j - 1 >= 0:
            potential.append(self.warp_path(i, j - 1))
        if (j - 1 >= 0) and (i - 1 >= 0):
            potential.append(self.warp_path(i - 1, j - 1))
        if len(potential) > 0:
            new_dist, new_path = min(potential, key=lambda x: x[0])
            distance += new_dist
            path = new_path + path
        return distance, path
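For reference, a minimal usage sketch with made-up sequences (my example, not part of the original post):

import numpy as np

seq1 = np.array([0.0, 1.0, 2.0, 3.0, 2.0])
seq2 = np.array([0.0, 2.0, 3.0, 2.0])

dtw = DynamicTimeWarp(seq1, seq2)
distance, path = dtw.warp_path()
print(distance)  # accumulated cost along the optimal warp path
print(path)      # tuple of (i, j) index pairs from (0, 0) to (4, 3)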
My questions:
Is this a valid implementation of DTW, as I believe?
Is there a better way to do this while maintaining the use of numpy arrays and the recurrence relation?
If I end up having to use a class, and then wish to reuse an instance of the class (by passing it new sequences and recalculating the warp_matrix), I will have to pass some kind of dummy value as an argument to the warp_path function, as otherwise I imagine lru_cache will incorrectly return stale values. Is there a more elegant way around this problem?
While it is easy to think of DTW as a recursive function, it is possible to implement an iterative version. The iterative version is typically 10 to 30 times faster.
eamonn
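To illustrate eamonn's point, here is a minimal iterative (dynamic-programming) sketch of DTW, my own and independent of the class above; it fills the cumulative-cost matrix with two nested loops instead of recursing, so there is no memoisation or recursion-depth concern:

import numpy as np

def dtw_iterative(seq1, seq2):
    # cost[i, j] = minimal accumulated distance ending at pair (i, j)
    n, m = len(seq1), len(seq2)
    cost = np.full((n, m), np.inf)
    cost[0, 0] = abs(seq1[0] - seq2[0])
    for i in range(n):
        for j in range(m):
            if i == j == 0:
                continue
            best_prev = min(
                cost[i - 1, j] if i > 0 else np.inf,
                cost[i, j - 1] if j > 0 else np.inf,
                cost[i - 1, j - 1] if (i > 0 and j > 0) else np.inf,
            )
            cost[i, j] = abs(seq1[i] - seq2[j]) + best_prev
    return cost[n - 1, m - 1]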
I am used to writing code in C++, but now I am trying to learn Python. I had heard that Python is very popular, so I thought I'd give it a shot.
Currently I am preparing for company interview questions and am able to solve most of them in C++. Alongside that, I am trying to write the code for the same problems in Python. For the things I am not familiar with, I do a Google search or watch tutorials, etc.
While writing Python code for my previously solved easy interview questions, I encountered a problem.
Problem: Given an array of integers, return indices of the two numbers such that they add up to a specific target.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
def twoNum(*arr, t):
    cur = 0
    x = 0
    y = 0
    for i in range(len(arr) - 1):
        for j in range(len(arr) - 1):
            if(i == j):
                break
            cur = arr[i] + arr[j]
            if(t == cur):
                x = arr[i]
                y = arr[j]
                break
        if(t == cur):
            break
    print(f"{x} + {y} = {x+y} ")
arr = [3, 5, -4, 8, 11, 1, -1, 6]
target = 10
twoNum(arr, t=target)
So here is the problem: I have defined x and y in the function, then set x = arr[i] and y = arr[j], and I'm printing those values.
The output is: 0 + 0 = 10 (where the target is 10).
I guess this is because I initialize x = 0 and y = 0 in the function and the values of x and y are never updated. Then I looked at the Outline section in VS Code, where I saw that x and y are declared twice: once at the start of the function and again in the for loop.
Can anyone explain to me what is going on here?
For reference, here is an image of the code I wrote in C++
Change this:
def twoNum(*arr, t):
to this:
def twoNum(arr, t):
* is used to indicate that there will be a variable number of arguments (see this); it is not for pointers as in C++.
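A quick demonstration of what *arr actually does (toy values, my own): the whole list arrives packed as a one-element tuple, so len(arr) - 1 is 0, both ranges in twoNum are empty, and x and y keep their initial value of 0.

def show(*arr, t):
    print(arr)               # the list is wrapped in a tuple

show([3, 5, -4, 8], t=10)    # prints ([3, 5, -4, 8],)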
Basically what you are trying to do is write C code in Python.
I would instead focus first on how to write Python code in a 'pythonic' way. But to answer your question, solving it your way using brute force in Python:
def two_num(arr, t):
    for idx, i in enumerate(arr):
        # compare each element only with the ones after it
        for j in arr[idx + 1:]:
            if i + j == t:
                print(f"{i} + {j} = {t}")
                return
Here's a way to implement a brute-force approach using a generator expression:
arr = [1,3,5,7,9]
target = 6
i,j = next((i,j) for i,n in enumerate(arr[:-1]) for j,m in enumerate(arr[i+1:],i+1) if n+m==target)
output:
print(f"arr[{i}] + arr[{j}] = {arr[i]} + {arr[j]} = {target}")
# arr[0] + arr[2] = 1 + 5 = 6
Perhaps even more pythonic would be to use iterators:
from itertools import tee
iArr = enumerate(arr)
i,j = next((i,j) for i,n in iArr for j,m in tee(iArr,1)[0] if n+m==target)
When you get to implementing an O(n) solution, you should look into dictionaries:
d = { target-n:j for j,n in enumerate(arr) }
i,j = next( (i,d[m]) for i,m in enumerate(arr) if m in d and d[m] != i )
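A quick check with the question's own data (my addition): the generator yields the first pair of indices whose values sum to the target.

arr = [3, 5, -4, 8, 11, 1, -1, 6]
target = 10
d = {target - n: j for j, n in enumerate(arr)}
i, j = next((i, d[m]) for i, m in enumerate(arr) if m in d and d[m] != i)
print(f"arr[{i}] + arr[{j}] = {arr[i]} + {arr[j]} = {target}")
# arr[4] + arr[6] = 11 + -1 = 10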
I'm baffled. I just ported my code from Java to Python. The good news is that the Python alternative for the lib I'm using is much quicker. The bad news is that my custom processing code is much slower with the Python version I wrote :( I even removed some parts I deemed unnecessary, and it's still much slower. The Java version took about half a second; Python takes 5-6.
rimg1 = imageio.imread('test1.png').astype(np.uint8)
rimg2 = imageio.imread('test2.png').astype(np.uint8)

sum_time = 0
for offset in range(-left, right):
    rdest = np.zeros((h, w, 3)).astype(np.uint8)
    if offset == 0:
        continue
    mult = np.uint8(1.0 / (offset * multiplier / frames))
    for y in range(h):
        for x in range(0, w - backup, 1):
            slice_time = time.time()
            src = rimg2[y, x] // mult + 1
            sum_time += time.time() - slice_time
            pix = rimg1[y, x + backup]
w ~= 384 and h ~= 384
src usually ranges from 0 to 30.
left to right spans -5 to 5.
How come sum_time takes about a third of my total time?
Edit
With the help of josephjscheidt I made some changes.
mult = np.uint8(1.0 / (offset * multiplier / frames))
multArray = np.floor_divide(rimg2, mult) + 1
for y in range(h):
    pixy = rimg1[y]
    multy = multArray[y]
    for x in range(0, w - backup, 1):
        src = multy[y]
        slice_time = time.time()
        pix = pixy[x + backup]
        sum_time += time.time() - slice_time
        ox = x
        for o in range(src):
            if ox < 0:
                break
            rdest[y, ox] = pix
            ox -= 1
Using the numpy iterator for the srcArray cuts total time almost in half! The numpy operation itself seems to take negligible time.
Now most of the time taken is in rimg1 lookup
pix = rimg1[x + backup]
and the inner for loop (both taking 50% of time). Is it possible to handle this with numpy operations as well?
Edit
I figured rewriting it could be of benefit, but somehow the following actually takes a little bit longer:
for x in range(0, w - backup, 1):
    slice_time = time.time()
    lastox = max(x - multy[y], 0)
    rdest[y, lastox:x] = pixy[x + backup]
    sum_time += time.time() - slice_time
Edit
slice_time = time.time()
depth = multy[y]
pix = pixy[x + backup]
ox = x
#for o in range(depth):
#    if ox < 0:
#        break
#
#    rdesty[ox] = pix
#    ox -= 1
# if I uncomment the above lines, and comment out the following two,
# it takes twice as long!
lastox = max(x - multy[y], 0)
rdesty[lastox:x] = pixy[x + backup]
sum_time += time.time() - slice_time
The Python interpreter is strange...
Time taken is now 2.5 seconds for sum_time. In comparison, Java does it in 60 ms.
For loops are notoriously slow with numpy arrays, and you have a three-layer for loop here. The underlying concept with numpy arrays is to perform operations on the entire array at once, rather than trying to iterate over them.
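As a toy illustration of the gap (made-up array, not the poster's data), compare a per-pixel loop with the equivalent whole-array operation; both compute img // mult + 1, but the vectorized form makes one numpy call instead of roughly 150,000 Python-level ones:

import numpy as np

img = np.random.randint(0, 256, size=(384, 384), dtype=np.uint8)
mult = 3

# loop version: one Python-level operation per pixel
out_loop = np.zeros_like(img)
for y in range(img.shape[0]):
    for x in range(img.shape[1]):
        out_loop[y, x] = img[y, x] // mult + 1

# vectorized version: a single call over the whole array
out_vec = img // mult + 1

assert (out_loop == out_vec).all()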
Although I can't entirely interpret your code, because most of the variables are undefined in the code chunk you provided, I'm fairly confident you can refactor here and vectorize your commands to remove the loops. For instance, if you redefine offset as a one-dimensional array, then you can calculate all values of mult at once without having to invoke a for loop: mult will become a one-dimensional array holding the correct values. We can avoid dividing by zero using the out argument (setting the default output to the offset array) and where argument (performing the calculation only where offset doesn't equal zero):
mult = np.uint8(np.divide(1.0, (offset * multiplier / frames),
                          out=offset, where=(offset != 0)))
Then, to use the mult array on the rimg2 row by row, you can use a broadcasting trick (here, I'm assuming you want to add one to each element in rimg2):
src = np.floor_divide(rimg2, mult[:,None], out = rimg2, where = (mult != 0)) + 1
I found this article extremely helpful when learning how to effectively work with numpy arrays:
https://realpython.com/numpy-array-programming/
Since you are working with images, you may want to pay particular attention to the section on image feature extraction and stride_tricks. Anyway, I hope this helps you get started.
Suppose we have a function defined as follows, and we would like to iterate over n from 1 to L. I've struggled a lot trying to vectorize this code, since the for loop needed outside to call this function makes it rather slow.
Details: L and K are large integers, e.g. 1000, and H_n is a float value.
def multifrac_Brownian_motion(n, L, K, list_hurst, ind_hurst):
    t_ks = np.asarray(sorted(-np.array(range(1, K + 1)) * (1. / L)))
    t_ns = np.linspace(0, 1, num=L + 1)
    t_n = t_ns[n]
    chi_k = np.random.randn(K)
    chi_lminus1 = np.random.randn(L)
    H_n = get_hurst_value(t_n, list_hurst, ind_hurst)
    part1 = 1. / (np.random.gamma(0.5 + H_n))
    sums1 = np.dot((t_n - t_ks)**(H_n - 0.5) - ((-t_ks)**(H_n - 0.5)), chi_k)
    sums2 = np.dot((t_n - t_ns[:n])**(H_n - 0.5), chi_lminus1[:n])
    return part1 * (1. / np.sqrt(L)) * (sums1 + sums2)

for n in range(1, L + 1):
    onelist.append(multifrac_Brownian_motion(n, L, K, list_hurst,
                                             ind_hurst=ind_hurst))
Update:
def list_hurst_funcs(M, seg_size=10):
    """Generate a list of Hurst function components

    Args:
        M: Int, number of hurst functions
        seg_size: Int, number of segmentations of interval [0, 1]

    Returns:
        list_hurst: List, list of hurst function components
    """
    list_hurst = []
    for i in range(M):
        seg_points = sorted(np.random.uniform(size=seg_size))
        funclist = np.random.uniform(size=seg_size + 1)
        list_hurst.append((seg_points, funclist))
    return list_hurst

def get_hurst_value(x, list_hurst, ind):
    if np.isscalar(x):
        x = np.array(float(x), ndmin=1)
    seg_points, funclist = list_hurst[ind]
    condlist = [x < seg_points[0]] + \
               [(x >= seg_points[s] and x < seg_points[s + 1])
                for s in range(len(seg_points) - 1)] + \
               [x >= seg_points[-1]]
    return np.piecewise(x, condlist=condlist, funclist=funclist)
One way to tackle a problem like this is to (try to) understand the big picture and come up with a different approach that treats everything as 2d or larger (L x K arrays). Another is to examine multifrac_Brownian_motion, trying to speed it up and, where possible, eliminate steps that depend on scalars or 1d arrays. In other words, work from the inside out. If we get enough of a speed improvement, it may not matter that we have to call it in a loop. Even better, the improvement may suggest ways of operating in higher dimensions.
As a start from the inside out, I'd suggest replacing the t_ks calculation with:
t_ks = -np.arange(K, 0, -1)/L  # 1./L if required by Py2 integer division
Since list_hurst and ind_hurst are the same for all n, I suspect you can move some time-consuming parts of get_hurst_value outside the loop.
But I'd put the most effort into improving that condlist construction. That's a list comprehension buried deep inside your outer loop.
piecewise also loops over those seg_points.
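For instance, since funclist in list_hurst_funcs holds constant values, a hedged sketch (my assumption about the intent, not the poster's code) replaces the per-call condlist and np.piecewise with a single np.searchsorted over all time points at once:

import numpy as np

def hurst_values_all(t_ns, list_hurst, ind):
    # assumes funclist contains constants, as generated by list_hurst_funcs;
    # searchsorted maps every time point to its segment in one vectorized call
    seg_points, funclist = list_hurst[ind]
    idx = np.searchsorted(seg_points, t_ns, side='right')
    return np.asarray(funclist)[idx]

# precompute once, outside the loop over n:
# H_all = hurst_values_all(np.linspace(0, 1, num=L + 1), list_hurst, ind_hurst)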
I am trying to revamp a function that uses the Pollard rho method to factor an integer, but my attempt at using memoize has brought no improvement in factoring a specific number (N = 7331117) that this function should be able to factor.
Before attempt:
import fractions

def pollard_Rho(n):
    def f(xn):
        if xn == 0:
            return 2
        return f(xn - 1) ** 2 + 1
    i = 0
    x = f(i)
    y = f(f(i))
    d = fractions.gcd(abs(x - y), n)
    while d == 1:
        i = i + 1
        d = fractions.gcd(abs(x - y), n)
    root1 = d
    root2 = n / d
    print i + 1
    return (root1, root2)
memoize attempt:
def pollard_Rho(n):
    class memoize:
        def __init__(self, function):
            self.function = function
            self.memoized = {}

        def __call__(self, *args):
            try:
                return self.memoized[args]
            except KeyError:
                self.memoized[args] = self.function(*args)
                return self.memoized[args]

    @memoize
    def f(xn):
        if xn == 0:
            return 2
        return f(xn - 1) ** 2 + 1

    i = 0
    x = f(i)
    y = f(f(i))
    d = fractions.gcd(abs(x - y), n)
    while d == 1:
        i = i + 1
        d = fractions.gcd(abs(x - y), n)
    root1 = d
    root2 = n / d
    print i + 1
    return (root1, root2)
Neither version produces any errors, but neither produces any results either.
The output of
print pollard_Rho(7331117)
should be (641, 11437) (I know this because of another factorization function I have written), but what actually happens is that the code runs through 3 iterations of the while loop and then nothing happens. Does anyone have any suggestions?
Sorry for the vague question, but does anyone have suggestions for improving the code's ability to factor in general? Maybe a method more efficient than a recursive function? 7331116 and 7331118 factor perfectly fine; only 7331117 seems to be a tough nut to crack so far using this method.
It's possible I didn't use memoize correctly, because even after looking at a number of Stack Overflow examples I don't really understand how to use it. It seems every instance of it I came across was drastically different.
It seems your algorithm does not work for some reason. In order to see what is going on, I went to the Wikipedia page for the algorithm and implemented the regular version from there, and it worked without a problem. Then I replaced my g function with your recursive version, and I got the following error:
File "rho.py", line 25, in f_fun
return 2 if xn == 0 else f_fun(xn - 1) ** 2 + 1
RecursionError: maximum recursion depth exceeded
It seems you cannot implement this with regular recursion. I would suggest converting the recursion to a fold or a generator.
Here is the code I tried:
https://gist.github.com/huseyinyilmaz/73c1ac42b2a20d24d3b5
UPDATE:
Here is your version with a cache; it still has the maximum recursion depth problem (Python 2 implementation):
https://gist.github.com/huseyinyilmaz/bb26ac172fbec4c655d3
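For comparison, a minimal iterative sketch of the same idea (mine, not from the gists; Python 3): it reduces mod n at every step, which the recursive f omits and which otherwise lets the values grow astronomically, and it advances x and y inside the loop, whereas the question's while loop never updates them, so its d can never change.

from math import gcd

def pollard_rho_iterative(n):
    # tortoise-and-hare form of x_{k+1} = x_k**2 + 1 (mod n)
    x, y, d = 2, 2, 1
    while d == 1:
        x = (x * x + 1) % n       # tortoise: one step
        y = (y * y + 1) % n       # hare: two steps
        y = (y * y + 1) % n
        d = gcd(abs(x - y), n)
    return d, n // d

# print(pollard_rho_iterative(7331117))
# should print the factors the question expects, 641 and 11437, in some order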
I am a newbie in Python. I'm stuck on solving Project Euler Problem 15 in reasonable time. The problem is in the memoize function. Without memoize everything works well, but only for small grids. I've tried to use memoization, but the result of such code is "1" for all grids.
def memoize(f):  # memoization
    memo = {}
    def helper(x):
        if x not in memo:
            memo[x] = f(x)
        return memo[x]
    return helper

@memoize
def search(node):
    global route
    if node[0] >= k and node[1] >= k:
        route += 1
        return route
    else:
        if node[0] < k + 1 and node[1] < k + 1:
            search((node[0] + 1, node[1]))
            search((node[0], node[1] + 1))
        return route

k = 2  # grid size
route = 0
print(search((0, 0)))
If I comment out the decorator to disable the memoize function:
#@memoize
everything works, but too slowly for big grids. What am I doing wrong? Help me debug. Thanks a lot!
Update 1:
Thanks for your help; I've found the answer too:
def memoize(f):
    memo = {}
    def helper(x):
        if x not in memo:
            memo[x] = f(x)
        return memo[x]
    return helper

@memoize
def search(node):
    n = 0
    if node[0] == k and node[1] == k:
        return 1
    if node[0] < k + 1 and node[1] < k + 1:
        n += search((node[0] + 1, node[1]))
        n += search((node[0], node[1] + 1))
    return n

k = 20
print(search((0, 0)))
The problem was not in the memoize function as I thought before; it was in the search function. Without globals it works the way I wanted. Thanks for the comments, they were really useful.
Your memoization function is fine, at least for this problem. For the more general case, I'd use this:
def memoize(f):
    f.cache = {}                 # one cache for each function
    def _f(*args, **kwargs):     # works with arbitrary arguments,
        if args not in f.cache:  # as long as those are hashable
            f.cache[args] = f(*args, **kwargs)
        return f.cache[args]
    return _f
The actual problem -- as pointed out by Kevin in the comments -- is that memoization only works if the function does not work via side effects. While your function does return the result, you do not use this in the recursive calculation, but just rely on incrementing the global counter variable. When you get an earlier result via memoization, that counter is not increased any further, and you do not use the returned value, either.
Change your function to sum up the results of the recursive calls, then it will work.
You can also simplify your code somewhat. In particular, the if check before the recursive call is not necessary, since you check for >= k anyway; but then you should check whether the x component or the y component is >= k, not both, because once either has hit k there is just one more route to the goal. Also, you could count down to 0 instead of up to k, so the code does not need k anymore.
@memoize
def search(node):
    x, y = node
    if x <= 0 or y <= 0:
        return 1
    return search((x - 1, y)) + search((x, y - 1))

print(search((20, 20)))
Try this code. It works fast even with grids over 1000x1000! Not necessarily square.
But I didn't know about memoization yet...
import time

def e15():
    x = int(input("Enter X of grid: "))
    y = int(input("Enter Y of grid: "))
    start = time.time()
    lst = list(range(1, x + 2))
    while lst[1] != y + 1:
        i = 0
        for n in lst[1:]:
            i += 1
            lst[i] = n + lst[i - 1]
    print(f"There are {lst[-1]} routes in {x}x{y} grid!")
    end = time.time() - start
    print("Runtime =", end)

e15()
This problem can be solved in O(1) time by using the code below:
from math import factorial as f

n, m = map(int, input("Enter dimensions (separate by space)? ").split())
print("Routes through a", n, "x", m, "grid:", f(n + m) // f(n) // f(m))
Here's a link for a proof of the equation:
Project Euler Problem 15 Solution