Optimize non-trivial function on tensors - python

I am looking for a way to speed up a specific operation on tensors in PyTorch. Since it is a general operation on matrices, I am open to answers in NumPy as well.
Let's say I have a tensor with values from 0 to N-1 (N=4) where each value repeats the same number of times (R=2).
import torch
x = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
In this case, it is sorted, but any permutation of x is also in the set of considered tensors X.
I am getting an input tensor with values from 0 to N-1 but without any constraints on the repetition.
z = torch.tensor([3, 2, 3, 0, 2, 3, 1, 2])
And I would like to find an efficient implementation of foo such that y = foo(z). y should be some permutation of x (from the set X) that makes as few changes to z as possible (in terms of Hamming distance), for example
y = torch.tensor([3, 2, 3, 0, 2, 0, 1, 1])
The trivial solution is to count the number of elements with the same value, but processing elements one by one is extremely inefficient for larger tensors:
def foo(z):
    R = 2
    N = 4
    counters = [0] * N
    # first, we replace extra elements with -1
    y = []
    for elem in z:
        if counters[elem] < R:
            counters[elem] += 1
            y.append(elem)
        else:
            y.append(-1)
    y = torch.tensor(y)
    assert torch.equal(y, torch.tensor([3, 2, 3, 0, 2, -1, 1, -1]))
    # second, we replace -1 by "unfilled" counters
    for i in range(len(y)):
        if y[i] == -1:
            first_unfilled = [n for n in range(N) if counters[n] < R][0]
            counters[first_unfilled] += 1
            y[i] = first_unfilled
    return y
assert torch.equal(y, foo(z))
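For completeness, here is a sketch of one possible vectorized approach (my own construction, not an established recipe; foo_vec is a hypothetical name, and it assumes z holds integers in [0, N) with len(z) == N*R). The idea is to rank each element among its equal values with a stable sort, keep the first R occurrences of each value, and fill the remaining slots with the under-represented values in ascending order, which reproduces the greedy behaviour of foo:

def foo_vec(z, N=4, R=2):
    # Stable sort groups equal values while keeping their original order.
    _, order = torch.sort(z, stable=True)  # stable= needs a reasonably recent PyTorch
    counts = torch.bincount(z, minlength=N)
    # Occurrence index (0-based) of every element within its value group.
    group_starts = torch.cumsum(counts, 0) - counts
    ranks = torch.empty_like(z)
    ranks[order] = torch.arange(len(z)) - torch.repeat_interleave(group_starts, counts)
    keep = ranks < R  # the first R occurrences of each value survive
    # Each under-represented value, repeated by its deficit, in ascending
    # order -- mirroring foo's "first unfilled counter" rule.
    deficits = (R - counts).clamp(min=0)
    fill = torch.repeat_interleave(torch.arange(N), deficits)
    y = z.clone()
    y[~keep] = fill
    return y

assert torch.equal(foo_vec(z), torch.tensor([3, 2, 3, 0, 2, 0, 1, 1]))

Since the surviving elements stay in place and only the excess positions are overwritten, the number of changed positions equals the total excess count, which is a lower bound on the Hamming distance for any valid y.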

Related

Pythonic way of finding indexes of unique elements in two arrays

I have two sorted numpy arrays similar to these ones:
import numpy as np
x = np.array([1, 2, 8, 11, 15])
y = np.array([1, 8, 15, 17, 20, 21])
Elements never repeat in the same array. I want to figure out a Pythonic way of producing a list of the index pairs at which the same element exists in both arrays.
For instance, 1 exists in x and y at index 0. Element 2 in x doesn't exist in y, so I don't care about that item. However, 8 does exist in both arrays - at index 2 in x but index 1 in y. Similarly, 15 exists in both, at index 4 in x but index 2 in y. So the outcome of my function in this case would be [[0, 0], [2, 1], [4, 2]].
So far what I'm doing is:
def get_indexes(x, y):
    indexes = []
    for i in range(len(x)):
        # Find index where item x[i] is in y:
        j = np.where(x[i] == y)[0]
        # If it exists, save it:
        if len(j) != 0:
            indexes.append([i, j[0]])
    return indexes
But the problem is that arrays x and y are very large (millions of items), so it takes quite a while. Is there a better pythonic way of doing this?
Without Python loops
Code
def get_indexes_darrylg(x, y):
    ' darrylg answer '
    # Use intersect to find common elements between two arrays
    overlap = np.intersect1d(x, y)
    # Indexes of common elements in each array
    loc1 = np.searchsorted(x, overlap)
    loc2 = np.searchsorted(y, overlap)
    # Zip the two 1d numpy arrays into a 2d array
    return np.dstack((loc1, loc2))[0]
Usage
x = np.array([1, 2, 8, 11, 15])
y = np.array([1, 8, 15, 17, 20, 21])
result = get_indexes_darrylg(x, y)
# result: array([[0, 0],
#                [2, 1],
#                [4, 2]], dtype=int64)
Timing Posted Solutions
Results show that darrylg's code has the fastest run time.
Code Adjustment
Each posted solution is wrapped as a function.
Slight modifications so that each solution outputs a numpy array.
Curves are named after the posters.
Code
import numpy as np
import perfplot

def create_arr(n):
    ' Creates pair of 1d numpy arrays with half the elements equal '
    max_val = 100000  # One more than largest value in output arrays
    arr1 = np.random.randint(0, max_val, (n,))
    arr2 = arr1.copy()
    # Change half the elements in arr2
    all_indexes = np.arange(0, n, dtype=int)
    indexes = np.random.choice(all_indexes, size=n//2, replace=False)  # locations to make changes
    np.put(arr2, indexes, np.random.randint(0, max_val, (n//2,)))  # assign new random values at change locations
    arr1 = np.sort(arr1)
    arr2 = np.sort(arr2)
    return (arr1, arr2)

def get_indexes_lllrnr101(x, y):
    ' lllrnr101 answer '
    ans = []
    i = 0
    j = 0
    while (i < len(x) and j < len(y)):
        if x[i] == y[j]:
            ans.append([i, j])
            i += 1
            j += 1
        elif (x[i] < y[j]):
            i += 1
        else:
            j += 1
    return np.array(ans)

def get_indexes_joostblack(x, y):
    ' joostblack answer '
    indexes = []
    for idx, val in enumerate(x):
        idy = np.searchsorted(y, val)
        try:
            if y[idy] == val:
                indexes.append([idx, idy])
        except IndexError:
            continue  # ignore index errors
    return np.array(indexes)

def get_indexes_mustafa(x, y):
    ' mustafa answer '
    indices_in_x = np.flatnonzero(np.isin(x, y))  # array([0, 2, 4])
    indices_in_y = np.flatnonzero(np.isin(y, x[indices_in_x]))  # array([0, 1, 2])
    return np.array(list(zip(indices_in_x, indices_in_y)))

def get_indexes_darrylg(x, y):
    ' darrylg answer '
    # Use intersect to find common elements between two arrays
    overlap = np.intersect1d(x, y)
    # Indexes of common elements in each array
    loc1 = np.searchsorted(x, overlap)
    loc2 = np.searchsorted(y, overlap)
    # Zip the two 1d numpy arrays into a 2d array
    return np.dstack((loc1, loc2))[0]

def get_indexes_akopcz(x, y):
    ' akopcz answer '
    return np.array([
        [i, j]
        for i, nr in enumerate(x)
        for j in np.where(nr == y)[0]
    ])

perfplot.show(
    setup=create_arr,  # tuple of two 1D random arrays
    kernels=[
        lambda a: get_indexes_lllrnr101(*a),
        lambda a: get_indexes_joostblack(*a),
        lambda a: get_indexes_mustafa(*a),
        lambda a: get_indexes_darrylg(*a),
        lambda a: get_indexes_akopcz(*a),
    ],
    labels=["lllrnr101", "joostblack", "mustafa", "darrylg", "akopcz"],
    n_range=[2 ** k for k in range(5, 21)],
    xlabel="Array Length",
    # More optional arguments with their default values:
    # logx="auto",  # set to True or False to force scaling
    # logy="auto",
    equality_check=None,  # np.allclose; set to None to disable "correctness" assertion
    # show_progress=True,
    # target_time_per_measurement=1.0,
    # time_unit="s",  # set to one of ("auto", "s", "ms", "us", or "ns") to force plot units
    # relative_to=1,  # plot the timings relative to one of the measurements
    # flops=lambda n: 3*n,  # FLOPS plots
)
What you are doing is O(n log n), which is decent enough.
If you want, you can do it in O(n) by iterating over both arrays with two pointers: since they are sorted, advance the pointer for the array with the smaller element.
See below:
x = [1, 2, 8, 11, 15]
y = [1, 8, 15, 17, 20, 21]

def get_indexes(x, y):
    ans = []
    i = 0
    j = 0
    while (i < len(x) and j < len(y)):
        if x[i] == y[j]:
            ans.append([i, j])
            i += 1
            j += 1
        elif (x[i] < y[j]):
            i += 1
        else:
            j += 1
    return ans

print(get_indexes(x, y))
which gives me:
[[0, 0], [2, 1], [4, 2]]
Note that this function searches for all occurrences of x[i] in the y array; since duplicates are not allowed in y, it will find each x[i] at most once.
def get_indexes(x, y):
    return [
        [i, j]
        for i, nr in enumerate(x)
        for j in np.where(nr == y)[0]
    ]
You can use numpy.searchsorted:
def get_indexes(x, y):
    indexes = []
    for idx, val in enumerate(x):
        idy = np.searchsorted(y, val)
        # guard: idy == len(y) when val is greater than every element of y
        if idy < len(y) and y[idy] == val:
            indexes.append([idx, idy])
    return indexes
One solution is to first look from x's side to see which of its values are included in y, by getting their indices through np.isin and np.flatnonzero, and then to use the same procedure from the other side; but instead of passing x entirely, we pass only the (already found) intersected elements, to save time:
indices_in_x = np.flatnonzero(np.isin(x, y)) # array([0, 2, 4])
indices_in_y = np.flatnonzero(np.isin(y, x[indices_in_x])) # array([0, 1, 2])
Now you can zip them to get the result:
result = list(zip(indices_in_x, indices_in_y)) # [(0, 0), (2, 1), (4, 2)]

How to stretch specific items of numpy array with decrement?

Given boundary value k, is there a vectorized way to replace each number n with consecutive descending numbers from n-1 down to k? For example, if k is 0 then I'd like to replace np.array([3,4,2,2,1,3,1]) with np.array([2,1,0,3,2,1,0,1,0,1,0,0,2,1,0,0]). Every item of the input array is greater than k.
I have tried a combination of np.repeat and np.cumsum, but it feels like a roundabout solution:
import numpy as np
x = np.array([3, 4, 2, 2, 1, 3, 1])
y = np.repeat(x, x)
t = -np.ones(y.shape[0])
t[np.r_[0, np.cumsum(x)[:-1]]] = x - 1
np.cumsum(t)
Is there any other way? I expect something like an inverse of np.add.reduceat that broadcasts integers into decreasing sequences instead of reducing them.
Here's another way with array-assignment to skip the repeat part -
def func1(a):
    l = a.sum()
    out = np.full(l, -1, dtype=int)
    out[0] = a[0] - 1
    idx = a.cumsum()[:-1]
    out[idx] = a[1:] - 1
    return out.cumsum()
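A quick sanity check of func1 against the expected output given in the question (for k = 0):

a = np.array([3, 4, 2, 2, 1, 3, 1])
expected = np.array([2, 1, 0, 3, 2, 1, 0, 1, 0, 1, 0, 0, 2, 1, 0, 0])
assert np.array_equal(func1(a), expected)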
Benchmarking
# OP's soln
def OP(x):
    y = np.repeat(x, x)
    t = -np.ones(y.shape[0], dtype=int)
    t[np.r_[0, np.cumsum(x)[:-1]]] = x - 1
    return np.cumsum(t)
Using the benchit package (a few benchmarking tools packaged together; disclaimer: I am its author) to benchmark the proposed solutions.
import benchit
a = np.array([3,4,2,2,1,3,1])
in_ = [np.resize(a,n) for n in [10, 100, 1000, 10000]]
funcs = [OP, func1]
t = benchit.timings(funcs, in_)
t.plot(logx=True, save='timings.png')
Extend to take k as arg
def func1(a, k):
    l = a.sum() + len(a)*(-k)
    out = np.full(l, -1, dtype=int)
    out[0] = a[0] - 1
    idx = (a - k).cumsum()[:-1]
    out[idx] = a[1:] - 1 - k
    return out.cumsum()
Sample run -
In [120]: a
Out[120]: array([3, 4, 2, 2, 1, 3, 1])

In [121]: func1(a, k=-1)
Out[121]:
array([ 2,  1,  0, -1,  3,  2,  1,  0, -1,  1,  0, -1,  1,  0, -1,  0, -1,
        2,  1,  0, -1,  0, -1])
This is concise and probably OK for efficiency; I don't think apply is vectorized here, so you will be limited mostly by the number of elements in the original array (and less so, I'd guess, by their values):
import numpy as np
import pandas as pd

x = np.array([3, 4, 2, 2, 1, 3, 1])
values = pd.Series(x).apply(lambda val: np.arange(val - 1, -1, -1)).values
output = np.concatenate(values)
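As a quick check, this reproduces the expected output from the question:

expected = np.array([2, 1, 0, 3, 2, 1, 0, 1, 0, 1, 0, 0, 2, 1, 0, 0])
assert np.array_equal(output, expected)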

Finding similar sub-sequences in a time series?

I have thousands of time series (24-dimensional data -- 1 dimension for each hour of the day). Out of these time series, I'm interested in a particular sub-sequence or pattern that looks like the highlighted section of the plot (figure not reproduced here).
I'm interested in sub-sequences that resemble the overall shape of the highlighted section -- that is, a sub-sequence with a sharp negative slope, followed by a period of several hours where the slope is relatively flat before finally ending with a sharp positive slope. I know the sub-sequences I'm interested in won't match each other exactly and most likely will be shifted in time, scaled differently, have longer/shorter periods where the slope is relatively flat, etc. but I would like to find a way to detect them all.
To do this, I have developed a simple Heuristic (based on my definition of the highlighted section) to quickly find some of the sub-sequences of interest. However, I was wondering if there was a more elegant way (in Python) to search thousands of time series for the sub-sequence I'm interested in (while taking into account things mentioned above -- differences in time, scale, etc.)?
Edit: a year later I cannot believe how much I overcomplicated flatline and slope detection; stumbling on the same question, I realized it's as simple as
idxs = np.where(x[1:] - x[:-1] == 0)[0]
idxs = [i for idx in idxs for i in (idx, idx + 1)]
The first line is implemented efficiently via np.diff(x); further, to e.g. detect slope > 5, use np.diff(x) > 5. The second line is needed because differencing tosses out right endpoints (e.g. diff([5,6,6,6,7]) = [1,0,0,1] -> idxs=[1,2], which excludes index 3).
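As a minimal, self-contained sketch of that diff-based detection (using the toy values from the parenthetical above):

import numpy as np

x = np.array([5, 6, 6, 6, 7])
idxs = np.flatnonzero(np.diff(x) == 0)              # array([1, 2]): left endpoints of flat steps
flat = np.unique(np.concatenate([idxs, idxs + 1]))  # array([1, 2, 3]): add the right endpoints back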
The functions below should do the job; the code is written with intuitive variable and method names and should be self-explanatory after a few read-throughs. The code is efficient and scalable.
Functionalities:
Specify min & max flatline length
Specify min & max slopes for left & right tails
Specify min & max average slopes for left & right tails, over multiple intervals
Example:
import numpy as np
import matplotlib.pyplot as plt

# Toy data
t = np.array([[ 5,  3,  3,  5,  3,  3,  3,  3,  3,  5,  5,  3,  3,  0,  4,
                1,  1, -1, -1,  1,  1,  1,  1, -1,  1,  1, -1,  0,  3,  3,
                5,  5,  3,  3,  3,  3,  3,  5,  7,  3,  3,  5]]).T
plt.plot(t)
plt.show()

# Get flatline indices
indices = get_flatline_indices(t, min_len=4, max_len=5)
plt.plot(t)
for idx in indices:
    plt.plot(idx, t[idx], marker='o', color='r')
plt.show()

# Filter by edge slopes
lims_left = (-10, -2)
lims_right = (2, 10)
averaging_intervals = [1, 2, 3]
indices_filtered = filter_by_tail_slopes(indices, t, lims_left, lims_right,
                                         averaging_intervals)
plt.plot(t)
for idx in indices_filtered:
    plt.plot(idx, t[idx], marker='o', color='r')
plt.show()
def get_flatline_indices(sequence, min_len=2, max_len=6):
    indices = []
    elem_idx = 0
    max_elem_idx = len(sequence) - min_len
    while elem_idx < max_elem_idx:
        current_elem = sequence[elem_idx]
        next_elem = sequence[elem_idx + 1]
        flatline_len = 0
        if current_elem == next_elem:
            while current_elem == next_elem:
                flatline_len += 1
                if elem_idx + flatline_len == len(sequence):
                    break  # the flatline runs to the end of the sequence
                next_elem = sequence[elem_idx + flatline_len]
            if flatline_len >= min_len:
                if flatline_len > max_len:
                    flatline_len = max_len
                trim_start = elem_idx
                trim_end = trim_start + flatline_len
                indices_to_append = [index for index in range(trim_start, trim_end)]
                indices += indices_to_append
            elem_idx += flatline_len
            flatline_len = 0
        else:
            elem_idx += 1
    return indices
def filter_by_tail_slopes(indices, data, lims_left, lims_right, averaging_intervals=1):
    indices_filtered = []
    indices_temp, tails_temp = [], []
    got_left, got_right = False, False
    for idx in indices:
        slopes_left, slopes_right = _get_slopes(data, idx, averaging_intervals)
        for tail_left, slope_left in enumerate(slopes_left):
            if _valid_slope(slope_left, lims_left):
                if got_left:
                    indices_temp = []  # discard prev if twice in a row
                    tails_temp = []
                indices_temp.append(idx)
                tails_temp.append(tail_left + 1)
                got_left = True
        if got_left:
            for edge_right, slope_right in enumerate(slopes_right):
                if _valid_slope(slope_right, lims_right):
                    if got_right:
                        indices_temp.pop(-1)
                        tails_temp.pop(-1)
                    indices_temp.append(idx)
                    tails_temp.append(edge_right + 1)
                    got_right = True
        if got_left and got_right:
            left_append = indices_temp[0] - tails_temp[0]
            right_append = indices_temp[1] + tails_temp[1]
            indices_filtered.append(_fill_range(left_append, right_append))
            indices_temp = []
            tails_temp = []
            got_left, got_right = False, False
    return indices_filtered

def _get_slopes(data, idx, averaging_intervals):
    if type(averaging_intervals) == int:
        averaging_intervals = [averaging_intervals]
    slopes_left, slopes_right = [], []
    for interval in averaging_intervals:
        slopes_left += [(data[idx] - data[idx - interval]) / interval]
        slopes_right += [(data[idx + interval] - data[idx]) / interval]
    return slopes_left, slopes_right

def _valid_slope(slope, lims):
    min_slope, max_slope = lims
    return (slope >= min_slope) and (slope <= max_slope)

def _fill_range(_min, _max):
    return [i for i in range(_min, _max + 1)]

Pythonic way to vectorize double summation

I'm attempting to convert a double summation formula into code, but can't figure out the correct matrix/vector representation of it.
The first summation runs over i = 1..n, and the second over j > i up to n; reconstructed from the code below, the quantity is sum_{i=1}^{n} sum_{j=i+1}^{n} w_i * w_j * v_i * v_j (with v the vols).
I'm guessing there is a much more efficient & pythonic way of writing this?
I resorted to nested for loops to just get it working but, as expected, it runs very slowly with a large dataset:
def wapc_denom(weights, vols):
    x = []
    y = []
    for i, wi in enumerate(weights):
        for j, wj in enumerate(weights):
            if j > i:
                x.append(wi * wj * vols[i] * vols[j])
        y.append(np.sum(x))
    return np.sum(y)
Edit:
Using guidance from smci's answer I think I have a potential solution:
def wapc_denom2(weights, vols):
    return np.sum(np.tril(np.outer(weights * vols, weights * vols), k=-1))
Assuming you want to count every term only once (for that you have to move the x = [] into the outer loop), one cheap way of computing the sum would be:
Create mock data
weights = np.random.random(10)
vols = np.random.random(10)
Do the calculation
wv = weights * vols
result = (wv.sum()**2 - wv @ wv) / 2
Check that it's the same
def wapc_denom(weights, vols):
    y = []
    for i, wi in enumerate(weights):
        x = []
        for j, wj in enumerate(weights):
            if j > i:
                x.append(wi * wj * vols[i] * vols[j])
        y.append(np.sum(x))
    return np.sum(y)
assert np.allclose(result, wapc_denom(weights, vols))
Why does it work?
What we are doing is computing the sum of the full outer-product matrix, subtracting the diagonal, and dividing by two. This is cheap because the sum of an outer product is just the product of the summed factors.
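Spelled out with wv[i] = weights[i] * vols[i] (a standard identity, added here for completeness):

(sum_i wv_i)**2 = sum_i wv_i**2 + 2 * sum_{i<j} wv_i * wv_j

so that

sum_{i<j} wv_i * wv_j = (wv.sum()**2 - wv @ wv) / 2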
wi * wj * vols[i] * vols[j] is a telltale. vols is another vector, so first you want to compute the vector wv = w * vols.
Then (wj * vols[j]) * (wi * vols[i]) = wv^T * wv is your (matrix outer product) expression; that's a column vector times a row vector. But actually you only want the sum, so there is no need to construct the intermediate vector with y.append(np.sum(x)); you're only going to sum it anyway with np.sum(y).
Also, the if j > i part means you only want the sum of the lower-triangular part, excluding the diagonal.
EDIT: the result is fully determined just from wv; I didn't think we needed the matrix to get the sum, and we didn't need the diagonal; @PaulPanzer found the most compact expression.
You can use triangular masks in numpy; check np.triu and np.meshgrid. Do:
np.prod(np.triu(np.meshgrid(weights, weights), 1) * np.triu(np.meshgrid(vols, vols), 1), 0).sum(1).cumsum().sum()
Example:
w = np.arange(4) + 1
v = np.array([1, 3, 2, 2])

print(np.triu(np.meshgrid(w, w), k=1))
>> array([[[0, 2, 3, 4],
           [0, 0, 3, 4],
           [0, 0, 0, 4],
           [0, 0, 0, 0]],

          [[0, 1, 1, 1],
           [0, 0, 2, 2],
           [0, 0, 0, 3],
           [0, 0, 0, 0]]])

# example of product + triu + meshgrid (your x values):
print(np.prod(np.triu(np.meshgrid(w, w), 1) * np.triu(np.meshgrid(v, v), 1), 0))
>> array([[ 0,  6,  6,  8],
          [ 0,  0, 36, 48],
          [ 0,  0,  0, 48],
          [ 0,  0,  0,  0]])

print(np.prod(np.triu(np.meshgrid(w, w), 1) * np.triu(np.meshgrid(v, v), 1), 0).sum(1).cumsum().sum())
>> 428

print(wapc_denom(w, v))
>> 428

Unable to print variables in Python when using def function

I am trying to implement a simple neural net. I want to print the initial pattern, weights, and activation. I then want it to print the learning process (i.e. every pattern it goes through as it learns). I am as yet unable to do this - it returns the initial and final pattern (when I put print p in appropriate places), but nothing else. Hints and tips appreciated - I'm a complete newbie to Python!
#!/usr/bin/python
import random
p = [ [1, 1, 1, 1, 1],
      [1, 1, 1, 1, 1],
      [0, 0, 0, 0, 0],
      [1, 1, 1, 1, 1],
      [1, 1, 1, 1, 1] ]  # pattern I want the net to learn

n = 5
alpha = 0.01
activation = []  # unit activations
weights = []  # weights
output = []  # output

def initWeights(n):  # set weights to zero, n is the number of units
    global weights
    weights = [[[0]*n]*n]  # initialised to zero

def initNetwork(p):  # initialises units to activation
    global activation
    activation = p

def updateNetwork(k):  # pick unit at random and update k times
    for l in range(k):
        unit = random.randint(0, n-1)
        activation[unit] = 0
        for i in range(n):
            activation[unit] += output[i] * weights[unit][i]
        output[unit] = 1 if activation[unit] > 0 else -1

def learn(p):
    for i in range(n):
        for j in range(n):
            weights += alpha * p[i] * p[j]
You have a problem with the line:
weights = [[[0]*n]*n]
When you use *, you multiply object references: you are reusing the same n-element list of zeroes every time. This will cause:
>>> weights[0][1][0] = 8
>>> weights
[[[8, 0, 0], [8, 0, 0], [8, 0, 0]]]
The first item of all the sublists is 8, because they are one and the same list. You stored the same reference multiple times, so modifying an item through any of them alters all of them.
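A minimal sketch of the usual fix (my own illustration, not from the original answer): build the rows in a list comprehension so each one is a fresh object. Note that updateNetwork indexes weights[unit][i], so a plain 2-D list (without the extra outer brackets) is what is wanted:

n = 3
# Every iteration of the comprehension creates a brand-new row list,
# so the rows are independent objects.
weights = [[0] * n for _ in range(n)]
weights[1][0] = 8
print(weights)  # [[0, 0, 0], [8, 0, 0], [0, 0, 0]]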
This is the line where you get "IndexError: list index out of range":
output[unit] = 1 if activation[unit] > 0 else -1
because output = [] is empty; you should use output.append(...) or similar.
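For instance (my own illustrative fix, not from the answer), pre-filling output with one value per unit makes output[unit] a valid index:

n = 5
output = [1] * n   # hypothetical initial outputs, one per unit
output[3] = -1     # indexing now works instead of raising IndexError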
