Pairwise difference of vectors - python

Say I have a matrix composed of N vectors as columns: matrix = [v_1, v_2, ..., v_N], where each v_i is in R^N. I want a new array that gives v_r - v_s for every pair (r, s). For example, if N=3, I want a 3x3x3 array in which the entry at index (i, j, k) is the kth component of v_i - v_j.

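You can use broadcasting on a new dimension. Note that this treats each row of matrix as a vector; if your vectors are stored as columns, apply it to matrix.T instead: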
out = matrix[:,None]-matrix
Example:
matrix = np.arange(9).reshape(3,3)
out = matrix[:,None]-matrix
Output:
array([[[ 0, 0, 0],
[-3, -3, -3],
[-6, -6, -6]],
[[ 3, 3, 3],
[ 0, 0, 0],
[-3, -3, -3]],
[[ 6, 6, 6],
[ 3, 3, 3],
[ 0, 0, 0]]])
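As a minimal sketch, the following checks the broadcast result against an explicit loop for the column-vector layout described in the question. The names cols and expected are illustrative and do not come from the original answer.

```python
import numpy as np

matrix = np.arange(9).reshape(3, 3)

# Rows as vectors: out_rows[i, j, k] == matrix[i, k] - matrix[j, k]
out_rows = matrix[:, None] - matrix

# Columns as vectors (assumption: v_i is matrix[:, i], as the question states)
cols = matrix.T
out_cols = cols[:, None] - cols

# Explicit-loop reference for the column case
n = matrix.shape[1]
expected = np.empty((n, n, matrix.shape[0]), dtype=matrix.dtype)
for i in range(n):
    for j in range(n):
        expected[i, j] = matrix[:, i] - matrix[:, j]

print(np.array_equal(out_cols, expected))
```

The loop is only there as a reference; the broadcast expression is the one to use in practice.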

Related

How to do indexing of a NumPy 3D-array based on 2D-array in Python?

Let's say I have a NumPy array A of shape (66,5) and B of shape (100, 66, 5).
The elements of A will index the first dimension (axis=0) of B, where the values are from 0 to 99 (i.e. the first dimension of B is 100).
A =
array([[ 1, 0, 0, 1, 0],
[ 0, 2, 0, 2, 4],
[ 1, 7, 0, 5, 5],
[ 2, 1, 0, 1, 7],
[ 0, 7, 0, 1, 4],
[ 0, 0, 3, 6, 0]
.... ]])
For example, A[4,1] will take index 7 of the first dimension of B, index 4 of the second dimension of B and index 1 of the third dimension of B.
What I want is to produce an array C of shape (66,5) that contains the elements of B selected by the values in A.
You can use np.take_along_axis to do that:
import numpy as np
np.random.seed(0)
a = np.random.randint(100, size=(66, 5))
b = np.random.random(size=(100, 66, 5))
c = np.take_along_axis(b, a[np.newaxis], axis=0)[0]
# Test some element
print(c[25, 3] == b[a[25, 3], 25, 3])
# True
If I understand correctly, you are looking for advanced indexing of the first dimension of B. You can use np.indices to create the indices required for the other two dimensions of B and use advanced indexing:
idx = np.indices(A.shape)
C = B[A,idx[0],idx[1]]
Example:
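B = np.random.rand(10,20,30)
A = np.array([[ 1, 0, 0, 1, 0],
              [ 0, 2, 0, 2, 4],
              [ 1, 7, 0, 5, 5],
              [ 2, 1, 0, 1, 7],
              [ 0, 7, 0, 1, 4],
              [ 0, 0, 3, 6, 0]])
# Build row/column index grids matching A, then index B with all three
idx = np.indices(A.shape)
C = B[A,idx[0],idx[1]]
print(C[4,1]==B[7,4,1])
#True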
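The two approaches above (np.take_along_axis and advanced indexing with np.indices) should give the same result. Here is a small sketch that checks this on random data with the shapes from the question; the variable names C1, C2, rows and cols are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.random((100, 66, 5))
A = rng.integers(0, B.shape[0], size=B.shape[1:])

# Approach 1: take_along_axis needs an index array with the same ndim as B
C1 = np.take_along_axis(B, A[np.newaxis], axis=0)[0]

# Approach 2: advanced indexing with explicit row/column index grids
rows, cols = np.indices(A.shape)
C2 = B[A, rows, cols]

print(C1.shape)
print(np.array_equal(C1, C2))
```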
You can also flatten B and compute the flat indices with NumPy functions:
print(A)
# array([[2, 0],
# [1, 1],
# [2, 0]])
print(B)
# array([[[ 5, 7],
# [ 0, 0],
# [ 0, 0]],
# [[ 1, 8],
# [ 1, 9],
# [10, 1]],
# [[12, 22],
# [ 2, 2],
# [ 2, 2]]])
# Flat index = value from A + offset of each (row, column) position in the rearranged B
temp = A.reshape(-1) + np.arange(A.size) * B.shape[0]
C = B.swapaxes(0, 1).swapaxes(2, 1).reshape(-1)[temp].reshape(A.shape)
print(C)
# array([[12, 7],
# [ 1, 9],
# [ 2, 0]])

How can two rows be swapped in a torch tensor?

var = [[0, 1, -4, 8],
[2, -3, 2, 1],
[5, -8, 7, 1]]
var = torch.Tensor(var)
Here, var is a 3 x 4 (2d) tensor. How can the first and second rows be swapped to get the following 2d tensor?
2, -3, 2, 1
0, 1, -4, 8
5, -8, 7, 1
The other answer does not work, as some rows get overwritten before they are copied:
>>> var = [[0, 1, -4, 8],
[2, -3, 2, 1],
[5, -8, 7, 1]]
>>> x = torch.tensor(var)
>>> index = torch.LongTensor([1, 0, 2])
>>> x[index] = x
>>> x
tensor([[ 0, 1, -4, 8],
[ 0, 1, -4, 8],
[ 5, -8, 7, 1]])
For me, it suffices to create a new tensor (with separate underlying storage) to hold the result:
>>> x = torch.tensor(var)
>>> index = torch.LongTensor([1, 0, 2])
>>> y = torch.zeros_like(x)
>>> y[index] = x
Alternatively, you can use index_copy_ (following this explanation in discuss.pytorch.org), although I don't see an advantage to either approach at the moment.
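For reference, a minimal sketch of the index_copy_ variant mentioned above, assuming the same var data as in the question. It writes into a fresh tensor, so no row of x is overwritten while it is still needed.

```python
import torch

x = torch.tensor([[0, 1, -4, 8],
                  [2, -3, 2, 1],
                  [5, -8, 7, 1]])
index = torch.tensor([1, 0, 2])

# index_copy_ writes row i of x into row index[i] of y, along dim 0
y = torch.zeros_like(x).index_copy_(0, index, x)
print(y)
```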
Other answers suggest that the permutation index should itself be a tensor, but that is not necessary. You can swap the 1st and 2nd rows like this:
>>> var
tensor([[ 0, 1, -4, 8],
[ 2, -3, 2, 1],
[ 5, -8, 7, 1]])
>>> var[[0, 1]] = var[[1, 0]]
>>> var
tensor([[ 2, -3, 2, 1],
[ 0, 1, -4, 8],
[ 5, -8, 7, 1]])
var can be a NumPy array or PyTorch tensor.
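A short sketch of the same swap on a NumPy array, using the data from the question (the name arr is illustrative):

```python
import numpy as np

arr = np.array([[0, 1, -4, 8],
                [2, -3, 2, 1],
                [5, -8, 7, 1]])

# The right-hand side is evaluated into a new array before assignment,
# so neither row is overwritten before it is read.
arr[[0, 1]] = arr[[1, 0]]
print(arr)
```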
You can use index_select for this:
>>> idx = torch.LongTensor([1,0,2])
>>> var.index_select(0, idx)
tensor([[ 2, -3, 2, 1],
[ 0, 1, -4, 8],
[ 5, -8, 7, 1]])
Generate the permutation index you desire:
index = torch.LongTensor([1,0,2])
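Apply the permutation by indexing the tensor rather than assigning into it, since var[index] = var can overwrite rows before they are read:
var = var[index]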

Fastest Pairwise Difference of Rows

I have an n by 3 numpy array, with each row representing a vector in 3D space.
I'd like to have a 3D n by n by 3 array with the vector at [i, j] representing the differences between each component of vectors i and j.
>>> x = np.random.randint(10, size=(4, 3))
>>> x
array([[4, 0, 8],
[8, 5, 3],
[4, 1, 6],
[2, 2, 4]])
>>> x[:, np.newaxis] - x
array([[[ 0, 0, 0],
[-4, -5, 5],
[ 0, -1, 2],
[ 2, -2, 4]],
[[ 4, 5, -5],
[ 0, 0, 0],
[ 4, 4, -3],
[ 6, 3, -1]],
[[ 0, 1, -2],
[-4, -4, 3],
[ 0, 0, 0],
[ 2, -1, 2]],
[[-2, 2, -4],
[-6, -3, 1],
[-2, 1, -2],
[ 0, 0, 0]]])
This works, but is the slowest line in my program by far...slower even than my Euclidean distance code (which takes advantage of dot products and einsum...)
Are there any clever matrix math methods for doing what I want in a more efficient way?
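One possible direction, offered as a sketch and not taken from the original thread: the difference array is antisymmetric (x[i] - x[j] == -(x[j] - x[i])) with a zero diagonal, so it may be enough to compute only the pairs with i < j. Whether this is actually faster depends on n and on whether the full n by n by 3 array is needed afterwards, so it should be timed on real data.

```python
import numpy as np

x = np.array([[4, 0, 8],
              [8, 5, 3],
              [4, 1, 6],
              [2, 2, 4]])
n = len(x)

# Index pairs (i, j) with i < j
i, j = np.triu_indices(n, k=1)
upper = x[i] - x[j]  # shape (n * (n - 1) // 2, 3)

# Optional: rebuild the full array from the upper-triangle differences
full = np.zeros((n, n, x.shape[1]), dtype=x.dtype)
full[i, j] = upper
full[j, i] = -upper

print(np.array_equal(full, x[:, np.newaxis] - x))
```

If downstream code can work directly with upper and the (i, j) index pairs, the rebuild step can be skipped entirely.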

How do I get the dot product but without the summation

Consider arrays a and b:
a = np.array([
[-1, 1, 5],
[-2, 3, 0]
])
b = np.array([
[1, 1, 0],
[0, 2, 3],
])
Looking at
d = a.T.dot(b)
d
array([[-1, -5, -6],
[ 1, 7, 9],
[ 5, 5, 0]])
d[0, 0] is -1, which is the sum of a[:, 0] * b[:, 0]. I'd like a 3x3 array of vectors where the [0, 0] position would be a[:, 0] * b[:, 0] itself, without the sum.
with the above a and b, I'd expect
d = np.array([[a[:, i] * b[:, j] for j in range(b.shape[1])] for i in range(a.shape[1])])
d
array([[[-1, 0],
[-1, -4],
[ 0, -6]],
[[ 1, 0],
[ 1, 6],
[ 0, 9]],
[[ 5, 0],
[ 5, 0],
[ 0, 0]]])
The sum of d along axis 2 should be the dot product a.T.dot(b):
d.sum(2)
array([[-1, -5, -6],
[ 1, 7, 9],
[ 5, 5, 0]])
Question
What is the most efficient way of getting d?
Here's one way:
In [219]: a
Out[219]:
array([[-1, 1, 5],
[-2, 3, 0]])
In [220]: b
Out[220]:
array([[1, 1, 0],
[0, 2, 3]])
In [221]: a.T[:,None,:] * b.T[None,:,:]
Out[221]:
array([[[-1, 0],
[-1, -4],
[ 0, -6]],
[[ 1, 0],
[ 1, 6],
[ 0, 9]],
[[ 5, 0],
[ 5, 0],
[ 0, 0]]])
Or...
In [231]: (a[:,None,:] * b[:,:,None]).T
Out[231]:
array([[[-1, 0],
[-1, -4],
[ 0, -6]],
[[ 1, 0],
[ 1, 6],
[ 0, 9]],
[[ 5, 0],
[ 5, 0],
[ 0, 0]]])
The most efficient way would be broadcasting, as shown in @Warren Weckesser's post, since we are basically dealing with element-wise multiplication without any sum-reduction.
An alternative one with np.einsum would be like so -
np.einsum('ij,ik->jki',a,b)
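As a quick check, here is a sketch confirming that the broadcasting and np.einsum versions agree, and that summing over the last axis recovers a.T.dot(b). The names d_broadcast and d_einsum are illustrative.

```python
import numpy as np

a = np.array([[-1, 1, 5],
              [-2, 3, 0]])
b = np.array([[1, 1, 0],
              [0, 2, 3]])

# d[j, k, i] == a[i, j] * b[i, k], with no sum over i
d_broadcast = a.T[:, None, :] * b.T[None, :, :]
d_einsum = np.einsum('ij,ik->jki', a, b)

print(d_broadcast.shape)
print(np.array_equal(d_broadcast, d_einsum))
print(np.array_equal(d_einsum.sum(axis=2), a.T.dot(b)))
```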
Another option is to loop over the rows of a.T:
for r in a.T:
    print(np.multiply(r,b.T))
[[-1 0]
[-1 -4]
[ 0 -6]]
[[1 0]
[1 6]
[0 9]]
[[5 0]
[5 0]
[0 0]]

Numpy Matrix Difference row by row into 3D tensor [duplicate]

I know I can do np.subtract.outer(x, x). If x has shape (n,), then I end up with an array with shape (n, n). However, I have an x with shape (n, 3). I want to output something with shape (n, n, 3). How do I do this? Maybe np.einsum?
You can use broadcasting after extending the dimensions with None/np.newaxis to form a 3D array version of x and subtracting the original 2D array version from it, like so -
x[:, np.newaxis, :] - x
Sample run -
In [6]: x
Out[6]:
array([[6, 5, 3],
[4, 3, 5],
[0, 6, 7],
[8, 4, 1]])
In [7]: x[:,None,:] - x
Out[7]:
array([[[ 0, 0, 0],
[ 2, 2, -2],
[ 6, -1, -4],
[-2, 1, 2]],
[[-2, -2, 2],
[ 0, 0, 0],
[ 4, -3, -2],
[-4, -1, 4]],
[[-6, 1, 4],
[-4, 3, 2],
[ 0, 0, 0],
[-8, 2, 6]],
[[ 2, -1, -2],
[ 4, 1, -4],
[ 8, -2, -6],
[ 0, 0, 0]]])
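To see why np.subtract.outer does not directly give the (n, n, 3) result, here is a small sketch comparing the shapes. The einsum call that extracts the matching components is an illustration added here, not part of the original answer.

```python
import numpy as np

x = np.array([[6, 5, 3],
              [4, 3, 5],
              [0, 6, 7],
              [8, 4, 1]])

outer = np.subtract.outer(x, x)  # outer[i, k, j, l] == x[i, k] - x[j, l]
diff = x[:, np.newaxis, :] - x   # diff[i, j, k] == x[i, k] - x[j, k]

print(outer.shape)
print(diff.shape)

# Keeping only the entries where both component indices match (k == l)
# recovers the broadcast result, at the cost of computing extra entries.
same = np.einsum('ikjk->ijk', outer)
print(np.array_equal(same, diff))
```

The broadcast form avoids building the larger (n, 3, n, 3) intermediate, which is why it is the usual choice.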
