I have an n by 3 numpy array, with each row representing a vector in 3D space.
I'd like an n by n by 3 array in which the vector at [i, j] holds the component-wise difference between vectors i and j.
>>> x = np.random.randint(10, size=(4, 3))
>>> x
array([[4, 0, 8],
[8, 5, 3],
[4, 1, 6],
[2, 2, 4]])
>>> x[:, np.newaxis] - x
array([[[ 0, 0, 0],
[-4, -5, 5],
[ 0, -1, 2],
[ 2, -2, 4]],
[[ 4, 5, -5],
[ 0, 0, 0],
[ 4, 4, -3],
[ 6, 3, -1]],
[[ 0, 1, -2],
[-4, -4, 3],
[ 0, 0, 0],
[ 2, -1, 2]],
[[-2, 2, -4],
[-6, -3, 1],
[-2, 1, -2],
[ 0, 0, 0]]])
This works, but it is by far the slowest line in my program, slower even than my Euclidean distance code (which takes advantage of dot products and einsum).
Are there any clever matrix math methods for doing what I want in a more efficient way?
Related
Say I have a matrix composed of N vectors as columns: matrix = [v_1, v_2, ..., v_N], where each v is in R^N. I want a new array that gives v(r) - v(s) for all r and s in 1..N. For example, if N = 3, I want a 3x3x3 array in which the entry at index (i, j, k) is the kth component of v(i) - v(j).
You can use broadcasting on a new dimension:
out = matrix[:,None]-matrix
Example:
matrix = np.arange(9).reshape(3,3)
out = matrix[:,None]-matrix
Output:
array([[[ 0, 0, 0],
[-3, -3, -3],
[-6, -6, -6]],
[[ 3, 3, 3],
[ 0, 0, 0],
[-3, -3, -3]],
[[ 6, 6, 6],
[ 3, 3, 3],
[ 0, 0, 0]]])
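As a minimal sketch of what the broadcast expression above computes: the check below treats the rows as the vectors, as the example does. The column-vector variant with `.T` is an assumption about how the question's layout would map onto the same expression, and the name `out_cols` is introduced here only for illustration.

```python
import numpy as np

# Rows are treated as the vectors here, as in the example above.
matrix = np.arange(9).reshape(3, 3)

# matrix[:, None] has shape (3, 1, 3); subtracting the (3, 3) array
# broadcasts both operands to shape (3, 3, 3).
out = matrix[:, None] - matrix

# out[i, j, k] is component k of (row i - row j).
assert out.shape == (3, 3, 3)
assert (out[2, 0] == matrix[2] - matrix[0]).all()

# If the vectors are stored as columns instead, transposing first turns
# each vector into a row, and the same expression applies.
cols = matrix.T
out_cols = cols[:, None] - cols
assert (out_cols[1, 2] == matrix[:, 1] - matrix[:, 2]).all()
```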
I am working on a multiple sequence alignment, and this is the score matrix I got by running the alignment algorithm.
My matrix:
[
[0, 24, -5, 3, -3, -5],
[0, -4, 8, 1, 1],
[0, 13, 1, 2],
[0, -2, 5],
[0, 4],
[0]
]
Matrix I want to build:
[
[0, 24, -5, 3, -3, -5],
[24, 0, -4, 8, 1, 1],
[-5, -4, 0, 13, 1, 2],
[3, 8, 13, 0, -2, 5],
[-3, 1, 1, -2, 0, 4],
[-5, 1, 2, 5, 4, 0]
]
I am trying to create a symmetric matrix from this output in Python without using NumPy or any other additional library. I already have an implementation that uses NumPy, but I want one that does not.
Try the following:
upper = [[0, 24, -5, 3, -3, -5], [0, -4, 8, 1, 1], [0, 13, 1, 2], [0, -2, 5], [0, 4], [0]]
n = len(upper) # 6: num of rows and cols (assuming square)
output = []
for i in range(n): # iterate over rows
    row = [(upper[i][j - i] if j >= i else output[j][i]) for j in range(n)]
    output.append(row)
print(output)
# [[0, 24, -5, 3, -3, -5], [24, 0, -4, 8, 1, 1], [-5, -4, 0, 13, 1, 2], [3, 8, 13, 0, -2, 5], [-3, 1, 1, -2, 0, 4], [-5, 1, 2, 5, 4, 0]]
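An alternative sketch of the same idea, still without NumPy, writes each value into both mirrored positions of a preallocated list of lists. The function name `symmetrize` and the row-length check are illustrative additions, not part of the answer above. It assumes row i of the input holds the entries for columns i through n-1.

```python
def symmetrize(upper):
    """Build a full symmetric matrix from ragged upper-triangle rows.

    Row i of `upper` is assumed to hold entries (i, i), (i, i+1), ..., (i, n-1).
    """
    n = len(upper)
    full = [[0] * n for _ in range(n)]
    for i, row in enumerate(upper):
        if len(row) != n - i:
            raise ValueError(f"row {i} should have {n - i} entries")
        for offset, value in enumerate(row):
            j = i + offset
            full[i][j] = value
            full[j][i] = value  # mirror across the diagonal
    return full


upper = [[0, 24, -5, 3, -3, -5], [0, -4, 8, 1, 1], [0, 13, 1, 2], [0, -2, 5], [0, 4], [0]]
full = symmetrize(upper)
for row in full:
    print(row)
```

Writing both positions at once avoids reading back from partially built rows, at the cost of allocating the full n by n grid up front.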
Consider arrays a and b:
a = np.array([
[-1, 1, 5],
[-2, 3, 0]
])
b = np.array([
[1, 1, 0],
[0, 2, 3],
])
Looking at
d = a.T.dot(b)
d
array([[-1, -5, -6],
[ 1, 7, 9],
[ 5, 5, 0]])
d[0, 0] is -1, which is the sum of a[:, 0] * b[:, 0]. I'd like a 3x3 array of vectors where the [0, 0] position would be a[:, 0] * b[:, 0] itself, before summing.
With the above a and b, I'd expect
d = np.array([[a[:, i] * b[:, j] for j in range(b.shape[1])] for i in range(a.shape[1])])
d
array([[[-1, 0],
[-1, -4],
[ 0, -6]],
[[ 1, 0],
[ 1, 6],
[ 0, 9]],
[[ 5, 0],
[ 5, 0],
[ 0, 0]]])
The sum of d along axis 2 should be the dot product a.T.dot(b):
d.sum(2)
array([[-1, -5, -6],
[ 1, 7, 9],
[ 5, 5, 0]])
Question
What is the most efficient way of getting d?
Here's one way:
In [219]: a
Out[219]:
array([[-1, 1, 5],
[-2, 3, 0]])
In [220]: b
Out[220]:
array([[1, 1, 0],
[0, 2, 3]])
In [221]: a.T[:,None,:] * b.T[None,:,:]
Out[221]:
array([[[-1, 0],
[-1, -4],
[ 0, -6]],
[[ 1, 0],
[ 1, 6],
[ 0, 9]],
[[ 5, 0],
[ 5, 0],
[ 0, 0]]])
Or...
In [231]: (a[:,None,:] * b[:,:,None]).T
Out[231]:
array([[[-1, 0],
[-1, -4],
[ 0, -6]],
[[ 1, 0],
[ 1, 6],
[ 0, 9]],
[[ 5, 0],
[ 5, 0],
[ 0, 0]]])
The most efficient approach would be broadcasting, as shown in @Warren Weckesser's post, since we are basically dealing with element-wise multiplication without any sum-reduction.
An alternative one with np.einsum would be like so -
np.einsum('ij,ik->jki',a,b)
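As a quick sanity check, the sketch below confirms that the two broadcasting expressions and the einsum call produce the same array for the sample a and b, and that summing over the last axis recovers a.T.dot(b). The variable names are chosen here for illustration only.

```python
import numpy as np

a = np.array([[-1, 1, 5], [-2, 3, 0]])
b = np.array([[1, 1, 0], [0, 2, 3]])

# Three ways to build d[j, k, i] = a[i, j] * b[i, k], shape (3, 3, 2).
d_broadcast = a.T[:, None, :] * b.T[None, :, :]
d_transposed = (a[:, None, :] * b[:, :, None]).T
d_einsum = np.einsum('ij,ik->jki', a, b)

assert np.array_equal(d_broadcast, d_transposed)
assert np.array_equal(d_broadcast, d_einsum)

# Summing over the last axis recovers the ordinary matrix product.
assert np.array_equal(d_broadcast.sum(axis=2), a.T.dot(b))
```

Which variant is fastest may depend on array sizes and the NumPy version, so it is worth timing them on real data.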
You can also loop over the rows of a.T:
for r in a.T:
    print(np.multiply(r, b.T))
[[-1 0]
[-1 -4]
[ 0 -6]]
[[1 0]
[1 6]
[0 9]]
[[5 0]
[5 0]
[0 0]]
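If the loop result is needed as a single array rather than printed output, one option is to collect the per-row products with np.stack. This is a sketch, and the name `d_loop` is hypothetical. It is likely slower than pure broadcasting for large inputs because of the Python-level loop.

```python
import numpy as np

a = np.array([[-1, 1, 5], [-2, 3, 0]])
b = np.array([[1, 1, 0], [0, 2, 3]])

# Each r is one column of a (length 2); b.T has shape (3, 2), so
# r * b.T broadcasts to a (3, 2) block. Stacking gives shape (3, 3, 2).
d_loop = np.stack([np.multiply(r, b.T) for r in a.T])

print(d_loop.shape)
```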
I know I can do np.subtract.outer(x, x). If x has shape (n,), then I end up with an array with shape (n, n). However, I have an x with shape (n, 3). I want to output something with shape (n, n, 3). How do I do this? Maybe np.einsum?
You can use broadcasting after extending the dimensions with None/np.newaxis to form a 3D array version of x and subtracting the original 2D array version from it, like so -
x[:, np.newaxis, :] - x
Sample run -
In [6]: x
Out[6]:
array([[6, 5, 3],
[4, 3, 5],
[0, 6, 7],
[8, 4, 1]])
In [7]: x[:,None,:] - x
Out[7]:
array([[[ 0, 0, 0],
[ 2, 2, -2],
[ 6, -1, -4],
[-2, 1, 2]],
[[-2, -2, 2],
[ 0, 0, 0],
[ 4, -3, -2],
[-4, -1, 4]],
[[-6, 1, 4],
[-4, 3, 2],
[ 0, 0, 0],
[-8, 2, 6]],
[[ 2, -1, -2],
[ 4, 1, -4],
[ 8, -2, -6],
[ 0, 0, 0]]])
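To connect this back to np.subtract.outer, the sketch below checks that each component slice of the broadcast result equals the 1-D outer subtraction for that component. It also shows that calling np.subtract.outer on the 2-D array directly yields a 4-D array rather than the desired (n, n, 3). The seed and array size are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch
x = rng.integers(0, 10, size=(5, 3))

# (5, 1, 3) - (5, 3) broadcasts to (5, 5, 3).
diff = x[:, np.newaxis, :] - x
assert diff.shape == (5, 5, 3)

# Each component slice matches the 1-D outer subtraction.
for k in range(x.shape[1]):
    assert np.array_equal(diff[:, :, k], np.subtract.outer(x[:, k], x[:, k]))

# Applied to the 2-D array directly, outer concatenates both shapes.
assert np.subtract.outer(x, x).shape == (5, 3, 5, 3)
```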
I generated a lower triangular matrix, and I want to complete the matrix using the values in the lower triangular matrix to form a square matrix, symmetrical around the diagonal zeros.
lower_triangle = numpy.array([
[0,0,0,0],
[1,0,0,0],
[2,3,0,0],
[4,5,6,0]])
I want to generate the following complete matrix, maintaining the zero diagonal:
complete_matrix = numpy.array([
[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
Thanks.
Since the diagonal is zero, you can simply add it to its transpose:
>>> m
array([[0, 0, 0, 0],
[1, 0, 0, 0],
[2, 3, 0, 0],
[4, 5, 6, 0]])
>>> m + m.T
array([[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
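Adding the transpose doubles the diagonal, which is harmless here only because the diagonal is zero. For a matrix with a nonzero diagonal, a possible variant subtracts the diagonal once. The helper name `symmetrize_lower` is made up for this sketch.

```python
import numpy as np


def symmetrize_lower(m):
    """Mirror the lower triangle of a square array into the upper triangle."""
    m = np.asarray(m)
    lower = np.tril(m)  # discard anything above the diagonal
    # lower + lower.T counts the diagonal twice, so subtract it once.
    return lower + lower.T - np.diag(np.diag(lower))


m = np.array([[7, 0, 0, 0],
              [1, 8, 0, 0],
              [2, 3, 9, 0],
              [4, 5, 6, 1]])
full = symmetrize_lower(m)
print(full)
```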
You can use numpy.triu_indices or numpy.tril_indices:
>>> a=np.array([[0, 0, 0, 0],
... [1, 0, 0, 0],
... [2, 3, 0, 0],
... [4, 5, 6, 0]])
>>> irows,icols = np.triu_indices(len(a),1)
>>> a[irows,icols]=a[icols,irows]
>>> a
array([[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
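Note that the index assignment above modifies a in place. If the original lower-triangular array should be preserved, a small wrapper can work on a copy instead. The function name `fill_upper_from_lower` is an assumption for illustration.

```python
import numpy as np


def fill_upper_from_lower(a):
    """Return a copy of `a` whose upper triangle mirrors its lower triangle."""
    out = np.array(a, copy=True)
    rows, cols = np.triu_indices(out.shape[0], k=1)  # strict upper triangle
    out[rows, cols] = out[cols, rows]
    return out


lower_triangle = np.array([[0, 0, 0, 0],
                           [1, 0, 0, 0],
                           [2, 3, 0, 0],
                           [4, 5, 6, 0]])
complete_matrix = fill_upper_from_lower(lower_triangle)
print(complete_matrix)
```

Unlike adding the transpose, this approach leaves the diagonal untouched, so it also works when the diagonal is nonzero.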