I'm trying to optimize the function 'pw' in the following code using only NumPy functions (or perhaps list comprehensions).
from time import time
import numpy as np
def pw(x, udata):
"""
Creates the step function
| 1, if d0 <= x < d1
| 2, if d1 <= x < d2
pw(x,data) = ...
| N, if d(N-1) <= x < dN
| 0, otherwise
where di is the ith element in data.
INPUT: x -- interval which the step function is defined over
data -- an ordered set of data (without repetitions)
OUTPUT: pw_func -- an array of size x.shape[0]
"""
vals = np.arange(1,udata.shape[0]+1).reshape(udata.shape[0],1)
pw_func = np.sum(np.where(np.greater_equal(x,udata)*np.less(x,np.roll(udata,-1)),vals,0),axis=0)
return pw_func
N = 50000
x = np.linspace(0,10,N)
data = [1,3,4,5,5,7]
udata = np.unique(data)
ti = time()
pw(x,udata)
tf = time()
print(tf - ti)
import cProfile
cProfile.run('pw(x,udata)')
The cProfile.run is telling me that most of the overhead is coming from np.where (about 1 ms), but I'd like to create faster code if possible. It seems that performing the operations row-wise versus column-wise makes some difference, unless I'm mistaken, but I think I've accounted for it. I know that sometimes list comprehensions can be faster, but I couldn't figure out a way faster than what I'm doing with one.
Searchsorted seems to yield better performance but that 1 ms still remains on my computer:
(modified)
def pw(xx, uu):
"""
Creates the step function
| 1, if d0 <= x < d1
| 2, if d1 <= x < d2
pw(x,data) = ...
| N, if d(N-1) <= x < dN
| 0, otherwise
where di is the ith element in data.
INPUT: x -- interval which the step function is defined over
data -- an ordered set of data (without repetitions)
OUTPUT: pw_func -- an array of size x.shape[0]
"""
inds = np.searchsorted(uu, xx, side='right')
vals = np.arange(1,uu.shape[0]+1)
pw_func = vals[inds[inds != uu.shape[0]]]
num_mins = np.sum(xx < np.min(uu))
num_maxs = np.sum(xx > np.max(uu))
pw_func = np.concatenate((np.zeros(num_mins), pw_func, np.zeros(xx.shape[0]-pw_func.shape[0]-num_mins)))
return pw_func
This answer using piecewise seems pretty close, but that's on a scalar x0 and x1. How would I do it on arrays? And would it be more efficient?
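(For reference, I imagine an array version would look something like the sketch below, with one boolean mask per interval as the condition list; I haven't timed it, so I don't know whether it would be faster.)
def pw_piecewise(x, udata):
    # hypothetical sketch: one mask per interval [d_i, d_(i+1)), values 1..N-1, 0 elsewhere
    conds = [(x >= udata[i]) & (x < udata[i + 1]) for i in range(len(udata) - 1)]
    vals = list(range(1, len(udata)))
    return np.piecewise(x, conds, vals + [0])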
Admittedly, x may be pretty big, but I'm trying to put it through a stress test.
I am still learning, though, so any hints or tricks that could help me out would be great.
EDIT
There seems to be a mistake in the second function, since the resulting array from the second function doesn't match the first one (which I'm confident works):
N1 = pw1(x,udata.reshape(udata.shape[0],1)).shape[0]
N2 = np.sum(pw1(x,udata.reshape(udata.shape[0],1)) == pw2(x,udata))
print(N1 - N2)
yields
15000
data points that are not the same. So it seems that I don't know how to use 'searchsorted'.
EDIT 2
Actually I fixed it:
pw_func = vals[inds[inds != uu.shape[0]]]
was changed to
pw_func = vals[inds[inds[(inds != uu.shape[0])*(inds != 0)]-1]]
so at least the resulting arrays match. But the question still remains on whether there's a more efficient way of going about doing this.
EDIT 3
Thanks Tin Lai for pointing out the mistake. This one should work
pw_func = vals[inds[(inds != uu.shape[0])*(inds != 0)]-1]
Maybe a more readable way of presenting it would be
non_endpts = (inds != uu.shape[0])*(inds != 0) # only consider the points in between the min/max data values
shift_inds = inds[non_endpts]-1 # searchsorted side='right' includes the left end point and not right end point so a shift is needed
pw_func = vals[shift_inds]
I think I got lost in all those brackets! I guess that's the importance of readability.
A very abstract yet interesting problem! Thanks for entertaining me, I had fun :)
p.s. I'm not sure about your pw2; I wasn't able to get it to output the same as pw1.
For reference the original pws:
def pw1(x, udata):
vals = np.arange(1,udata.shape[0]+1).reshape(udata.shape[0],1)
pw_func = np.sum(np.where(np.greater_equal(x,udata)*np.less(x,np.roll(udata,-1)),vals,0),axis=0)
return pw_func
def pw2(xx, uu):
inds = np.searchsorted(uu, xx, side='right')
vals = np.arange(1,uu.shape[0]+1)
pw_func = vals[inds[inds[(inds != uu.shape[0])*(inds != 0)]-1]]
num_mins = np.sum(xx < np.min(uu))
num_maxs = np.sum(xx > np.max(uu))
pw_func = np.concatenate((np.zeros(num_mins), pw_func, np.zeros(xx.shape[0]-pw_func.shape[0]-num_mins)))
return pw_func
My first attempt utilised a lot of broadcasting operations from numpy:
def pw3(x, udata):
# the None slice is to create new axis
step_bool = x >= udata[None,:].T
# we exploit the fact that bools are integer value of 1s
# skipping the last value in "data"
step_vals = np.sum(step_bool[:-1], axis=0)
    # for the step_bool that we skipped in the previous step (last index),
    # we zero out step_vals once x has reached
    # the last value in "data"
step_vals[step_bool[-1]] = 0
return step_vals
After looking at the searchsorted from your pw2, I came up with a new approach that utilises it with much higher performance:
def pw4(x, udata):
inds = np.searchsorted(udata, x, side='right')
    # fix up the trailing values where x is already beyond udata[-1]
if x[-1] > udata[-1]:
inds[inds == inds[-1]] = 0
return inds
Plots with:
plt.plot(pw1(x,udata.reshape(udata.shape[0],1)), label='pw1')
plt.plot(pw2(x,udata), label='pw2')
plt.plot(pw3(x,udata), label='pw3')
plt.plot(pw4(x,udata), label='pw4')
with data = [1,3,4,5,5,7]:
with data = [1,3,4,5,5,7,11]
pw1,pw3,pw4 are all identical
print(np.all(pw1(x,udata.reshape(udata.shape[0],1)) == pw3(x,udata)))
>>> True
print(np.all(pw1(x,udata.reshape(udata.shape[0],1)) == pw4(x,udata)))
>>> True
Performance: (timeit.repeat runs 3 repetitions by default; each figure is the total time for number=1000 executions)
print(timeit.Timer('pw1(x,udata.reshape(udata.shape[0],1))', "from __main__ import pw1, x, udata").repeat(number=1000))
>>> [3.1938983199979702, 1.6096494779994828, 1.962694135003403]
print(timeit.Timer('pw2(x,udata)', "from __main__ import pw2, x, udata").repeat(number=1000))
>>> [0.6884554479984217, 0.6075002400029916, 0.7799002879983163]
print(timeit.Timer('pw3(x,udata)', "from __main__ import pw3, x, udata").repeat(number=1000))
>>> [0.7369808239964186, 0.7557657590004965, 0.8088172269999632]
print(timeit.Timer('pw4(x,udata)', "from __main__ import pw4, x, udata").repeat(number=1000))
>>> [0.20514375300263055, 0.20203858999957447, 0.19906871100101853]
I'm making a trading strategy that uses support and resistance levels. One of the ways I'm finding those is by searching for maxima/minima (prices that are higher/lower than the previous and next 5 prices).
I have an array of smoothed closing prices and I first tried to find them with a for loop:
def find_max_min(smoothed_prices):  # smoothed_prices = np.array([1.873,...])
avg_delta = np.diff(smoothed_prices).mean()
maximas = []
minimas = []
for index in range(len(smoothed_prices)):
if index < 5 or index > len(smoothed_prices) - 6:
continue
current_value = smoothed_prices[index]
previous_points = smoothed_prices[index - 5:index]
next_points = smoothed_prices [index+1:index+6]
previous_are_higher = all(x > current_value for x in previous_points)
next_are_higher = all(x > current_value for x in next_points)
previous_are_smaller = all(x < current_value for x in previous_points)
next_are_smaller = all(x < current_value for x in next_points)
        previous_delta_is_enough = abs(previous_points[0] - current_value) > avg_delta
next_delta_is_enough = abs(next_points[-1] - current_value) > avg_delta
delta_is_enough = previous_delta_is_enough and next_delta_is_enough
if previous_are_higher and next_are_higher and delta_is_enough:
minimas.append(current_value)
        elif previous_are_smaller and next_are_smaller and delta_is_enough:
maximas.append(current_value)
else:
continue
return maximas, minimas
(This isn't the actual code that I used because I erased it; it may not work, but it was something like that.)
So this code could find the maxima and minima, but it was way too slow, and I need to use the function multiple times per second on huge arrays.
My question is: is it possible to do it with a numpy mask, in a similar way to this:
smoothed_prices = s
minimas = s[all(x > s[index] for x in s[index-5:index]) and all(x > s[index] for x in s[index+1:index+6])]
maximas = ...
or do you know how I could do it in another efficient numpy way?
I have thought of a way; it should be faster than the for loop you presented, but it uses more memory. Simply put, it creates an intermediate matrix of windows, then just gets the max and min of each window:
def find_max_min(arr, win_pad_size=5):
windows = np.zeros((len(arr) - 2 * win_pad_size, 2 * win_pad_size + 1))
for i in range(2 * win_pad_size + 1):
windows[:, i] = arr[i:i+windows.shape[0]]
return windows.max(axis=1), windows.min(axis=1)
Edit: I found a faster way to calculate the sub-sequences (which I had called windows) from Split Python sequence into subsequences. It doesn't use more memory; instead, it creates a view of the array.
def subsequences(ts, window):
shape = (ts.size - window + 1, window)
strides = ts.strides * 2
return np.lib.stride_tricks.as_strided(ts, shape=shape, strides=strides)
def find_max_min(arr, win_pad_size=5):
windows = subsequences(arr, 2 * win_pad_size + 1)
return windows.max(axis=1), windows.min(axis=1)
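If you then want the actual extrema rather than just the window statistics, one way (a sketch, assuming win_pad_size=5 as above and a numpy array input) is to compare each window's max/min against its centre point:
win_max, win_min = find_max_min(smoothed_prices)
centers = smoothed_prices[5:-5]          # points with 5 neighbours on each side
maximas = centers[centers == win_max]    # centre equals its window maximum
minimas = centers[centers == win_min]    # centre equals its window minimum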
You can do it easily by:
from skimage.util import view_as_windows
a = smoothed_prices[4:-5]
a[a == view_as_windows(smoothed_prices, (10)).min(-1)]
Please note that since you are looking at minima within +/- 5 of the index, they can only be at indices [4:-5] of your array.
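For completeness, a fuller sketch along the same lines (maxima handled the same way with .max(-1); same window of 10 as above):
from skimage.util import view_as_windows
windows = view_as_windows(smoothed_prices, (10,))
a = smoothed_prices[4:-5]
minimas = a[a == windows.min(-1)]
maximas = a[a == windows.max(-1)]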
I have a piece of code that computes partitions of a set of (potentially duplicated) integers, but I am interested in the set of possible partitions and their multiplicities.
You can, for example, run the following code:
import numpy as np
from collections import Counter
import pandas as pd
def _B(i):
    # for a given multi-index i, we define _B(i) as the multiset containing the number j exactly i_j times:
if len(i) != 1:
B = []
for j in range(len(i)):
B.extend(i[j]*[j])
else:
        B = i[0]*[0]
return B
def _partition(collection):
# from here: https://stackoverflow.com/a/62532969/8425270
if len(collection) == 1:
yield (collection,)
return
first = collection[0]
for smaller in _partition(collection[1:]):
# insert `first` in each of the subpartition's subsets
for n, subset in enumerate(smaller):
yield smaller[:n] + ((first,) + subset,) + smaller[n + 1 :]
# put `first` in its own subset
yield ((first,),) + smaller
def to_list(tpl):
    # convert the nested tuples of a partition into nested lists
return list(list(i) if isinstance(i, tuple) else i for i in tpl)
def _Pi(inst_B):
# inst_B must be a tuple
if type(inst_B) != tuple :
inst_B = tuple(inst_B)
pp = [tuple(sorted(p)) for p in _partition(inst_B)]
c = Counter(pp)
Pi = c.keys()
N = list()
for pi in Pi:
N.append(c[pi])
Pi = [to_list(pi) for pi in Pi]
return Pi, N
if __name__ == "__main__":
import cProfile
pr = cProfile.Profile()
pr.enable()
sh = (3, 3, 3)
rez = list()
rez_sorted= list()
rez_ref = list()
for idx in np.ndindex(sh):
if sum(idx) > 0:
print(idx)
Pi, N = _Pi(_B(idx))
print(pd.DataFrame({'Pi': Pi, 'N': N * np.array([np.math.factorial(len(pi) - 1) for pi in Pi])}))
pr.disable()
# after your program ends
pr.print_stats(sort="tottime")
This code computes, for several examples of tuples of integers (generated by np.ndindex), the partitions and counts I need. Everything happens in the _partition and _Pi functions; that is where you should look.
If you look closely at how these two functions work, you'll see that they compute every potential partition and THEN count how many times each appeared. For small problems this is fine, but as the size of the problem increases, this starts to take a lot of time. Try setting sh = (5,5,5) and you'll see what I mean.
So the problem is the following: is there a way to compute the distinct partitions and their numbers of occurrences directly instead?
Edit: I cross-posted on MathOverflow there, and they propose a solution in this article, in Corollary 2.10 (page 10 of the pdf). The problem could be solved by implementing the sets p(v,r) from this corollary.
I was hoping, as in the univariate case, that those sets would have a nice recursive expression, but I have not found one yet.
More Edit: This problem is equivalent to finding all (multiset)-partitions of a multiset. If the solution for finding (set)-partitions of a set is given by partial Bell polynomials, here we need a multivariate version of these polynomials.
I want to calculate the relative rank of each element in an array among elements before it. For example in an array [2,1,4,3], the relative rank (from small to large) of the second element (1) among a subset array of [2,1] is 1. The relative rank of the third element (4) among a subset array of [2,1,4] is 3. The final relative rank of each element should be [1,1,3,3].
I'm using the following python code:
x = np.array([2,1,4,3])
rr = np.ones(4)
for i in range(1,4):
rr[i] = sum(x[i] >= x[:i+1])
Are there any other faster ways?
Not sure if it's faster, but you can do this with a list comprehension, which always brightens my day:
[sorted(x[:i+1]).index(v)+1 for i, v in enumerate(x)]
Here's a vectorized way with broadcasting -
n = len(x)
m1 = x[1:,None]>=x
m2 = np.tri(n-1,n,k=1, dtype=bool)
rr[1:] = (m1 & m2).sum(1)
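As a quick sanity check on the question's example (with rr initialised to ones as in the question):
x = np.array([2, 1, 4, 3])
rr = np.ones(len(x))
n = len(x)
m1 = x[1:,None]>=x
m2 = np.tri(n-1,n,k=1, dtype=bool)
rr[1:] = (m1 & m2).sum(1)
print(rr)   # [1. 1. 3. 3.]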
Alternatively, we could bring in einsum or np.matmul to do the last step of sum-reduction -
(m1.astype(np.float32)[:,None,:] @ m2[:,:,None])[:,0,0]
np.einsum('ij,ij->i',m1.astype(np.float32),m2)
Your current algorithm takes quadratic time, which isn't going to scale to large inputs. You can do a lot better.
One way to do better would be to use a sorted data structure, like sortedcontainers.SortedList, and perform a series of lookups and insertions. The following example implementation returns a list, assumes no ties, and starts ranks from 0:
import sortedcontainers
def rank(nums):
sortednums = sortedcontainers.SortedList()
ranks = []
for num in nums:
ranks.append(sortednums.bisect_left(num))
sortednums.add(num)
return ranks
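For example, on the question's array this gives 0-based ranks; adding 1 recovers the question's convention:
print(rank([2, 1, 4, 3]))                    # [0, 0, 2, 2]
print([r + 1 for r in rank([2, 1, 4, 3])])   # [1, 1, 3, 3], matching the question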
Most of the work is inside the SortedList implementation, and SortedList is pretty fast, so this shouldn't have too much Python overhead. The existence of sortedcontainers definitely makes this more convenient than the next option, if not necessarily more efficient.
This option runs in... O(n log n)-ish time. SortedList uses a two-layer hierarchy instead of a traditional tree structure, making a deliberate tradeoff of more data movement for less pointer chasing, so insertion isn't theoretically O(log n), but it's efficient in practice.
The next option would be to use an augmented mergesort. If you do this, you're going to want to use Numba or Cython, because you'll have to write the loops manually.
The basic idea is to do a mergesort, but tracking the rank of each element in its subarray as you go. When you merge two sorted subarrays, each element on the left side keeps its old rank, while the rank values for elements on the right side get adjusted upward for how many elements on the left were less than them.
This option runs in O(n log n).
An unoptimized implementation operating on Python lists, assuming no ties, and starting ranks at 0, would look like this:
def rank(nums):
_, indexes, ranks = _augmented_mergesort(nums)
result = [None]*len(nums)
for i, rank_ in zip(indexes, ranks):
result[i] = rank_
return result
def _augmented_mergesort(nums):
# returns sorted nums, indexes of sorted nums in original nums, and corresponding ranks
if len(nums) == 1:
return nums, [0], [0]
left, right = nums[:len(nums)//2], nums[len(nums)//2:]
return _merge(*_augmented_mergesort(left), *_augmented_mergesort(right))
def _merge(lnums, lindexes, lranks, rnums, rindexes, rranks):
nums, indexes, ranks = [], [], []
i_left = i_right = 0
def add_from_left():
nonlocal i_left
nums.append(lnums[i_left])
indexes.append(lindexes[i_left])
ranks.append(lranks[i_left])
i_left += 1
def add_from_right():
nonlocal i_right
nums.append(rnums[i_right])
indexes.append(rindexes[i_right] + len(lnums))
ranks.append(rranks[i_right] + i_left)
i_right += 1
while i_left < len(lnums) and i_right < len(rnums):
if lnums[i_left] < rnums[i_right]:
add_from_left()
elif lnums[i_left] > rnums[i_right]:
add_from_right()
else:
raise ValueError("Tie detected")
if i_left < len(lnums):
nums += lnums[i_left:]
indexes += lindexes[i_left:]
ranks += lranks[i_left:]
else:
while i_right < len(rnums):
add_from_right()
return nums, indexes, ranks
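A quick check of this version against the question's example (again 0-based; add 1 to match):
print(rank([2, 1, 4, 3]))   # [0, 0, 2, 2] -> adding 1 gives [1, 1, 3, 3]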
For an optimized implementation, you'd want an insertion sort base case, you'd want to use Numba or Cython, you'd want to operate on arrays, and you'd want to not do so much allocation.
You are all my heroes, doing a great job! I'd like to show you a comparison of each of your solutions:
import numpy as np
import time
import sortedcontainers
def John(x):
n=len(x)
rr=np.ones(n)
for i in range(1,n):
rr[i]=sum(x[i]>=x[:i+1])
return rr
def Matvei(x):
return [sorted(x[:i+1]).index(v)+1 for i, v in enumerate(x)]
def Divarkar1(x):
n = len(x)
m1 = x[1:,None]>=x
m2 = np.tri(n-1,n,k=1, dtype=bool)
rr[1:] = (m1 & m2).sum(1)
return rr
def Divarkar2(x):
    n = len(x)
    rr = np.ones(n)
m1 = x[1:,None]>=x
m2 = np.tri(n-1,n,k=1, dtype=bool)
    (m1.astype(np.float32)[:,None,:] @ m2[:,:,None])[:,0,0]
rr[1:]=np.einsum('ij,ij->i',m1.astype(np.float32),m2)
return rr
def Monica(x):
sortednums = sortedcontainers.SortedList()
ranks = []
for num in x:
ranks.append(sortednums.bisect_left(num))
sortednums.add(num)
return np.array(ranks)+1
x=np.random.rand(4000)
t1=time.time()
rr=John(x)
t2=time.time()
print(t2-t1)
#print(rr)
t1=time.time()
rr=Matvei(x)
t2=time.time()
print(t2-t1)
#print(rr)
t1=time.time()
rr=Divarkar1(x)
t2=time.time()
print(t2-t1)
#print(rr)
t1=time.time()
rr=Divarkar2(x)
t2=time.time()
print(t2-t1)
#print(rr)
t1=time.time()
rr=Monica(x)
t2=time.time()
print(t2-t1)
#print(rr)
The results are:
19.5
2.9
0.079
0.25
0.017
I ran several times and results are similar. The best one is Monica's algorithm!
Many thanks to everyone!
John
When I converted all the algorithms to work on a 2D numpy array, I found my algorithm is the best. Of course the performance also depends on the dimensions of the 2D array, but 380x900 is my case. I think the numpy array calculation benefits it a lot. Here is the code:
import numpy as np
import time
import sortedcontainers
def John(x): #x is 1D array
n=len(x)
rr=[]
for i in range(n):
rr.append(np.sum(x[i]>=x[:i+1]))
return np.array(rr)
def John_2D(rv): #rv is 2d numpy array. rank it along axis 1!
nr,nc=rv.shape
rr=[]
for i in range(nc):
rr.append(np.sum((rv[:,:i+1]<=rv[:,i:i+1]),axis=1))
return np.array(rr).T
def Matvei(x): #x is 1D array
return [sorted(x[:i+1]).index(v)+1 for i, v in enumerate(x)]
def Divarkar1(x):#x is 1D array
n = len(x)
rr=np.ones(n,dtype=int)
m1 = x[1:,None]>=x
m2 = np.tri(n-1,n,k=1, dtype=bool)
rr[1:] = (m1 & m2).sum(1)
return rr
def Divarkar2(x):#x is 1D array
n = len(x)
rr=np.ones(n,dtype=int)
m1 = x[1:,None]>=x
m2 = np.tri(n-1,n,k=1, dtype=bool)
    (m1.astype(np.float32)[:,None,:] @ m2[:,:,None])[:,0,0]
rr[1:]=np.einsum('ij,ij->i',m1.astype(np.float32),m2)
return rr
def Monica1(nums): #nums is 1D array
sortednums = sortedcontainers.SortedList()
ranks = []
for num in nums:
ranks.append(sortednums.bisect_left(num))
sortednums.add(num)
return np.array(ranks)+1
def Monica2(nums): #nums is 1D array
_, indexes, ranks = _augmented_mergesort(nums)
result = [None]*len(nums)
for i, rank_ in zip(indexes, ranks):
result[i] = rank_
return np.array(result)+1
def _augmented_mergesort(nums): #nums is 1D array
# returns sorted nums, indexes of sorted nums in original nums, and corresponding ranks
if len(nums) == 1:
return nums, [0], [0]
left, right = nums[:len(nums)//2], nums[len(nums)//2:] #split the array by half
return _merge(*_augmented_mergesort(left), *_augmented_mergesort(right))
def _merge(lnums, lindexes, lranks, rnums, rindexes, rranks):
nums, indexes, ranks = [], [], []
i_left = i_right = 0
def add_from_left():
nonlocal i_left
nums.append(lnums[i_left])
indexes.append(lindexes[i_left])
ranks.append(lranks[i_left])
i_left += 1
def add_from_right():
nonlocal i_right
nums.append(rnums[i_right])
indexes.append(rindexes[i_right] + len(lnums))
ranks.append(rranks[i_right] + i_left)
i_right += 1
while i_left < len(lnums) and i_right < len(rnums):
if lnums[i_left] < rnums[i_right]:
add_from_left()
elif lnums[i_left] > rnums[i_right]:
add_from_right()
else:
raise ValueError("Tie detected")
if i_left < len(lnums):
while i_left < len(lnums):
add_from_left()
#nums += lnums[i_left:]
#indexes += lindexes[i_left:]
#ranks += lranks[i_left:]
else:
while i_right < len(rnums):
add_from_right()
return nums, indexes, ranks
def rank_2D(f,nums): #f is method, nums is 2D numpy array
result=[]
for x in nums:
result.append(f(x))
return np.array(result)
x=np.random.rand(6000)
for f in [John, Matvei, Divarkar1, Divarkar2, Monica1, Monica2]:
t1=time.time()
rr=f(x)
t2=time.time()
print(f'{f.__name__+"_1D: ":16} {(t2-t1):.3f}')
print()
x=np.random.rand(380,900)
t1=time.time()
rr=John_2D(x)
t2=time.time()
print(f'{"John_2D:":16} {(t2-t1):.3f}')
#print(rr)
for f in [Matvei, Divarkar1, Divarkar2, Monica1, Monica2]:
t1=time.time()
rr=rank_2D(f,x)
t2=time.time()
print(f'{f.__name__+"_2D: ":16} {(t2-t1):.3f}')
#print(rr)
The typical results are:
John_1D: 0.069
Matvei_1D: 7.208
Divarkar1_1D: 0.163
Divarkar2_1D: 0.488
Monica1_1D: 0.032
Monica2_1D: 0.082
John_2D: 0.409
Matvei_2D: 49.044
Divarkar1_2D: 1.276
Divarkar2_2D: 4.065
Monica1_2D: 1.090
Monica2_2D: 3.571
For a 1D array, Monica1's method is the best, but my numpy-version method is not too bad.
For a 2D array, my numpy-version method is the best.
You're welcome to test and comment.
Thanks
John
This probably leads to scipy/numpy, but right now I'm happy with any functionality, as I couldn't find anything in those packages. I have a matrix that contains data for a multivariate distribution (let's say 2 variables, for the fun of it). Is there any function to compute (higher) moments of that? All I could find were numpy.mean() and numpy.cov() :o
Thanks :)
/edit:
So, some more detail: I have multivariate data, that is, a matrix where rows are variables and columns are observations. Now I would like a simple way of computing the joint moments of that data, as defined in http://en.wikipedia.org/wiki/Central_moment#Multivariate_moments .
I'm pretty new to python/scipy so I'm not sure I'd be the best person to code this one up, especially for the n-variables case (note that the wikipedia definition is for n=2), and I kind of expected there to be some out-of-the-box thing to use as I thought this would be a standard problem.
/edit2:
Just for the future, in case someone wants to do something similar, the following code (which is still under review) should give the sample equivalents of the raw moments E(X^2), E(Y^2), etc. It only works for two variables right now, but it should be extendable if one feels the need. If you see some mistakes or unclean/unpythonic code, feel free to comment.
from numpy import *
# this function should return something as
# moments[0] = 1
# moments[1] = mean(X), mean(Y)
# moments[2] = 1/n*X'X, 1/n*X'Y, 1/n*Y'Y
# moments[3] = mean(X'X'X), mean(X'X'Y), mean(X'Y'Y),
# mean(Y'Y'Y)
# etc
def getRawMoments(data, moment, axis=0):
a = moment
if (axis==0):
n = float(data.shape[1])
X = matrix(data[0,:]).reshape((n,1))
Y = matrix(data[1,:]).reshape((n,1))
else:
n = float(data.shape[0])
X = matrix(data[:,0]).reshape((n,1))
        Y = matrix(data[:,1]).reshape((n,1))
result = 1
Z = hstack((X,Y))
iota = ones((1,n))
moments = {}
moments[0] = 1
#first, generate huge-ass matrix containing all x-y combinations
# for every power-combination k,l such that k+l = i
# for all 0 <= i <= a
for i in arange(1,a):
if i==2:
moments[i] = moments[i-1]*Z
        # if odd, postmultiply with Z.T
elif i%2 == 1:
moments[i] = kron(moments[i-1], Z.T)
        # else (even), postmultiply blockwise with Z
elif i%2==0:
temp = moments[i-1]
temp2 = temp[:,0:n]*Z
temp3 = temp[:,n:2*n]*Z
moments[i] = hstack((temp2, temp3))
# since now we have many multiple moments
# such as x**2*y and x*y*x, filter non-distinct elements
momentsDistinct = {}
momentsDistinct[0] = 1
for i in arange(1,a):
if i%2 == 0:
data = 1/n*moments[i]
elif i == 1:
temp = moments[i]
temp2 = temp[:,0:n]*iota.T
data = 1/n*hstack((temp2))
else:
temp = moments[i]
temp2 = temp[:,0:n]*iota.T
temp3 = temp[:,n:2*n]*iota.T
data = 1/n*hstack((temp2, temp3))
momentsDistinct[i] = unique(data.flat)
    return momentsDistinct
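For comparison, here is a shorter brute-force sketch (just an illustration, not the code above) that should generalise to any number of variables, assuming rows are variables and columns are observations; for central moments, subtract the row means from data first:
import itertools
import numpy as np

def raw_moments(data, order):
    # returns {exponent tuple k: sample estimate of E[prod_j X_j**k_j]} for all k with sum(k) == order
    n_vars, n_obs = data.shape
    out = {}
    for ks in itertools.product(range(order + 1), repeat=n_vars):
        if sum(ks) == order:
            prod = np.prod(data ** np.array(ks)[:, None], axis=0)
            out[ks] = prod.mean()
    return out

# e.g. raw_moments(data, 2) returns the sample E(X^2), E(XY), E(Y^2) for a 2-variable data matrix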
I would like to query the value of an exponentially weighted moving average at particular points. An inefficient way to do this is as follows. l is the list of times of events and queries has the times at which I want the value of this average.
a=0.01
l = [3,7,10,20,200]
y = [0]*1000
for item in l:
y[int(item)]=1
s = [0]*1000
for i in xrange(1,1000):
s[i] = a*y[i-1]+(1-a)*s[i-1]
queries = [23,68,103]
for q in queries:
print s[q]
Outputs:
0.0355271185019
0.0226018371526
0.0158992102478
In practice l will be very large and the range of values in l will also be huge. How can you find the values at the times in queries more efficiently, and especially without computing the potentially huge lists y and s explicitly? I need it to be in pure Python so I can use PyPy.
Is it possible to solve the problem in time proportional to len(l)
and not max(l) (assuming len(queries) < len(l))?
Here is my code for doing this:
def ewma(l, queries, a=0.01):
def decay(t0, x, t1, a):
from math import pow
return pow((1-a), (t1-t0))*x
assert l == sorted(l)
assert queries == sorted(queries)
samples = []
try:
t0, x0 = (0.0, 0.0)
it = iter(queries)
q = it.next()-1.0
for t1 in l:
# new value is decayed previous value, plus a
x1 = decay(t0, x0, t1, a) + a
# take care of all queries between t0 and t1
while q < t1:
samples.append(decay(t0, x0, q, a))
q = it.next()-1.0
# take care of all queries equal to t1
while q == t1:
samples.append(x1)
q = it.next()-1.0
# update t0, x0
t0, x0 = t1, x1
# take care of any remaining queries
while True:
samples.append(decay(t0, x0, q, a))
q = it.next()-1.0
except StopIteration:
return samples
I've also uploaded a fuller version of this code with unit tests and some comments to pastebin: http://pastebin.com/shhaz710
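A quick check against the example at the top (Python 2, since the code uses it.next()):
print ewma([3, 7, 10, 20, 200], [23, 68, 103])
# should print values close to the question's output:
# [0.0355..., 0.0226..., 0.0158...]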
EDIT: Note that this does the same thing as what Chris Pak suggests in his answer, which he must have posted as I was typing this. I haven't gone through the details of his code, but I think mine is a bit more general. This code supports non-integer values in l and queries. It also works for any kind of iterables, not just lists since I don't do any indexing.
I think you could do it in lg(len(l)) time per query, if l is sorted. The basic idea is that the non-recursive form of the EMA is s_i = a*y_(i-1) + a*(1-a)*y_(i-2) + a*(1-a)^2*y_(i-3) + ...
This means that for a query at time k, you find the greatest time in l that is less than k and then, up to an estimation limit, sum terms of the form (1-a)^(k - l[i]) * v[l[i]], where i is an index into l and v[l[i]] is the event value at time l[i].
Then you spend lg(len(l)) time in the search plus a constant multiple of the depth of your estimation. I'll provide a code sample in a little bit (after work) if you want it; I just wanted to get my idea out there while I was thinking about it.
here's the code -
v is the dictionary of values at a given time; replace with 1 if it's just a 1 every time...
import math
from bisect import bisect_right
a = .01
limit = 1000
l = [1,5,14,29...]
def find_nearest_lt(l, time):
    i = bisect_right(l, time)
if i:
return i-1
raise ValueError
def find_ema(l, time):
i = find_nearest_lt(l, time)
if l[i] == time:
        result = a * v[l[i]]
i -= 1
else:
result = 0
while (time-l[i]) < limit:
result += math.pow(1-a, time-l[i]) * v[l[i]]
i -= 1
return result
If I'm thinking correctly, the find-nearest step is lg(n), and then the while loop is <= 1000 iterations, guaranteed, so it's technically a constant (though a kind of large one). find_nearest was stolen from the page on bisect - http://docs.python.org/2/library/bisect.html
It appears that y is a binary value -- either 0 or 1 -- depending on the values of l. Why not use y = set(int(item) for item in l)? That's the most efficient way to store and look up a list of numbers.
Your code will silently use the wrong values the first time through this loop:
s = [0]*1000
for i in xrange(1000):
s[i] = a*y[i-1]+(1-a)*s[i-1]
because i-1 is -1 when i=0 (first pass of the loop), and both y[-1] and s[-1] are the last element of the list, not the previous one. Maybe you want xrange(1,1000)?
How about this code:
a=0.01
l = [3.0,7.0,10.0,20.0,200.0]
y = set(int(item) for item in l)
queries = [23,68,103]
ewma = []
x = 1 if (0 in y) else 0
for i in xrange(1, queries[-1]+1):
x = (1-a)*x
if i in y:
x += a
if i == queries[0]:
ewma.append(x)
queries.pop(0)
When it's done, ewma should have the moving averages for each query point.
Edited to include SchighSchagh's improvements.