3 dimensional matrix multiplication in tensorflow


For a 2-dimensional matrix A of size (N, K) with elements a, we can get a 3-dimensional tensor B of size (N, K, N) with elements b such that b[i, k, j] = a[i, k] * a[j, k] by the operation
B = tf.expand_dims(A, -1) * tf.transpose(A)
Now, given a 3-dimensional tensor A of size (M, N, K), is there a way to compute a 4-dimensional tensor B of size (M, N, K, N) such that
b[m, i, k, j] = a[m, i, k] * a[m, j, k]?

Try einsum:
B = np.einsum('mik,mjk->mikj', A, A)
If A is a TensorFlow tensor rather than a NumPy array, tf.einsum accepts the same subscript string.
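To make the subscripts concrete, here is a small NumPy-only sketch that compares the einsum result with a broadcasting version modelled on the 2-dimensional trick from the question. The sizes and the names B_einsum and B_broadcast are just for illustration.

```python
import numpy as np

M, N, K = 2, 3, 4
rng = np.random.default_rng(0)
A = rng.integers(1, 100, size=(M, N, K))

# b[m, i, k, j] = a[m, i, k] * a[m, j, k]
B_einsum = np.einsum('mik,mjk->mikj', A, A)

# Same thing with broadcasting:
#   A[:, :, :, None]                      has shape (M, N, K, 1), indexed [m, i, k, -]
#   np.swapaxes(A, 1, 2)[:, None, :, :]   has shape (M, 1, K, N), indexed [m, -, k, j]
B_broadcast = A[:, :, :, None] * np.swapaxes(A, 1, 2)[:, None, :, :]

print(B_einsum.shape)
print(np.array_equal(B_einsum, B_broadcast))
```

Both expressions produce an array of shape (M, N, K, N); the einsum form is usually easier to read because the index pattern is spelled out in the string.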

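Bemma, this solution should work: expand A along two different axes, multiply with broadcasting, then transpose the result.
import numpy as np
import tensorflow as tf
M, N, K = 2, 3, 4  # insert your dimensions here
A = tf.constant(np.random.randint(1, 100, size=[M, N, K]))  # generate A
# (M, 1, N, K) * (M, N, 1, K) broadcasts to (M, N, N, K), indexed [m, i, j, k]
B = tf.expand_dims(A, 1) * tf.expand_dims(A, 2)
# swap the last two axes to get (M, N, K, N), indexed [m, i, k, j]
B = tf.transpose(B, perm=[0, 1, 3, 2])
# test to verify the result:
for m in range(M):
    for i in range(N):
        for k in range(K):
            for j in range(N):
                assert B[m, i, k, j] == A[m, i, k] * A[m, j, k]
This test passes without errors.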

Related

vectorize the assignment of 3d numpy arrays conditioned on the associate values at other dimensions

Is it possible to vectorize the following code in Python? It runs very slowly when the size of the array becomes large.
import numpy as np
# A, B, C are 3d arrays with shape (K, N, N).
# Entries in A, B, and C are in [0, 1].
# In the following, I use random values in B and C as an example.
K = 5
N = 10000
A = np.zeros((K, N, N))
B = np.random.normal(0, 1, (K, N, N))
C = np.random.normal(0, 1, (K, N, N))
for k in range(K):
    for m in [x for x in range(K) if x != k]:
        for i in range(N):
            for j in range(N):
                if A[m, i, j] not in [0, 1]:
                    if A[k, i, j] == 1:
                        A[m, i, j] = B[m, i, j]
                    if A[k, i, j] == 0:
                        A[m, i, j] = C[m, i, j]

I cannot identify a way to fully vectorize this, but I can suggest using the numba package to reduce the computation time. You can decorate the function with njit, here with the nogil=True parameter, to speed up your code.
from numba import njit
@njit(nogil=True)
def function():
    for k in range(K):
        for m in [x for x in range(K) if x != k]:
            for i in range(N):
                for j in range(N):
                    if A[k, i, j] == 1 and A[m, i, j] not in [0, 1]:
                        A[m, i, j] = B[m, i, j]
                    if A[k, i, j] == 0 and A[m, i, j] not in [0, 1]:
                        A[m, i, j] = C[m, i, j]
%timeit function()
7.35 s ± 252 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
With njit and the nogil parameter, it took about 7 seconds to run the whole thing; without njit, my code has been running for hours (and still is). CPython has a global interpreter lock (GIL) that lets only one thread execute Python bytecode at a time. With nogil=True, the compiled function releases the GIL while it runs, so it can execute concurrently with other threads. However, when using nogil=True, you'll have to be wary of the usual pitfalls of multi-threaded programming (consistency, synchronization, race conditions, etc.).
You can look at the documentation about Numba here.
https://numba.pydata.org/numba-doc/dev/user/jit.html?highlight=nogil

I can help with a partial vectorization that should speed things up quite a bit. I'm not sure about your logic for k vs. m, so I left those two loops in place. Essentially, you create a mask with the conditions you want checked across the 2nd and 3rd dimensions of A, then copy values into A from either B or C using the appropriate mask:
# A, B, C are 3d arrays with shape (K, N, N).
# Entries in A, B, and C are in [0, 1].
# In the following, I use random values in B and C as an example.
np.random.seed(10)
K = 5
N = 1000
A = np.zeros((K, N, N))
B = np.random.normal(0, 1, (K, N, N))
C = np.random.normal(0, 1, (K, N, N))
for k in range(K):
    for m in [x for x in range(K) if x != k]:
        #if A[m, i, j] not in [0, 1]:
        mask_1 = A[k, :, :] == 1
        mask_0 = A[k, :, :] == 0
        A[m, mask_1] = B[m, mask_1]
        A[m, mask_0] = C[m, mask_0]
I omitted the A[m, i, j] not in [0, 1] check because it made the example difficult to debug: A is initialized as all zeros, so with the check in place nothing ever happens. If you need additional logic like this, create another boolean mask for it and combine it with each existing mask using & (element-wise and).
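As a rough sketch of that last point, the omitted check can be written as a third mask and combined with the other two. The function names update_loops and update_masked and the small test arrays are made up for illustration; A is seeded with a few 0.5 entries so that the check actually triggers.

```python
import numpy as np

def update_loops(A, B, C):
    """Reference version: the four nested loops from the question."""
    A = A.copy()
    K, N, _ = A.shape
    for k in range(K):
        for m in [x for x in range(K) if x != k]:
            for i in range(N):
                for j in range(N):
                    if A[m, i, j] not in [0, 1]:
                        if A[k, i, j] == 1:
                            A[m, i, j] = B[m, i, j]
                        if A[k, i, j] == 0:
                            A[m, i, j] = C[m, i, j]
    return A

def update_masked(A, B, C):
    """Same update with the i and j loops replaced by boolean masks."""
    A = A.copy()
    K = A.shape[0]
    for k in range(K):
        for m in [x for x in range(K) if x != k]:
            # Entries of A[m] that are neither 0 nor 1 (the omitted check)
            free = (A[m] != 0) & (A[m] != 1)
            mask_1 = free & (A[k] == 1)
            mask_0 = free & (A[k] == 0)
            A[m, mask_1] = B[m, mask_1]
            A[m, mask_0] = C[m, mask_0]
    return A

rng = np.random.default_rng(10)
K, N = 3, 4
A = rng.choice([0.0, 0.5, 1.0], size=(K, N, N))
B = rng.random((K, N, N))
C = rng.random((K, N, N))

result_loops = update_loops(A, B, C)
result_masked = update_masked(A, B, C)
print(np.array_equal(result_loops, result_masked))
```

The two masks are computed before either assignment, which is safe here because they are disjoint and A[k] is never modified while m != k.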
Update on 7/6/22
If you want to update the above code to remove the loop over m, then you can initialize an array with all the values of k, and use that to expand the mask to include all 3 dimensions, excluding each value of k that matches m as follows:
np.random.seed(10)
K = 5
N = 1000
A_2 = np.zeros((K, N, N))
B = np.random.normal(0, 1, (K, N, N))
C = np.random.normal(0, 1, (K, N, N))
K_vals = np.array(range(K))
for k in range(K):
    #for m in [x for x in range(K) if x != k]:
    #if A[m, i, j] not in [0, 1]:
    k_dim_2_skip = K_vals == k
    mask_1 = np.tile(A_2[k, :, :] == 1, (K, 1, 1))
    mask_1[k_dim_2_skip, :, :] = False
    mask_0 = np.tile(A_2[k, :, :] == 0, (K, 1, 1))
    mask_0[k_dim_2_skip, :, :] = False
    A_2[mask_1] = B[mask_1]
    A_2[mask_0] = C[mask_0]
Combine these masks with the & np.logical_not... code you added in the comment below and that should do it. Note that the more you vectorize, the larger the mask arrays you have to build and manipulate, so there is a tradeoff with memory consumption. There is usually a sweet spot that balances run time against memory usage for a given problem.

How to vectorize with mismatched dimensionality

I have some constraints of the form of
A_{i,j,k} = r_{i,j}B_{i,j,k}
A is an n x m x p array, as is B; r is an n x m matrix.
I would like to vectorize this in Python somehow, as efficiently as possible. Right now, I am making r into nxmxp matrix by saying r_{i,j,k} = r_{i,j} for all 1 <= k <= p. Then I call np.multiply on r and B. This seems inefficient. Any ideas welcome, thanks.
import random
import numpy as np
def ndHadamardProduct(r, n, m, p):  # r is an n x m matrix, p is an int
    rnew = np.zeros((n, m, p))
    B = np.zeros((n, m, p))
    for i in range(n):
        for j in range(m):
            for k in range(p):
                rnew[i, j, k] = r[i, j]  # repeat r along the third axis
                B[i, j, k] = random.uniform(0, 1)
    return np.multiply(rnew, B)

Add an extra dimension with np.newaxis and then broadcasting takes care of the repetition for you.
import numpy as np
r = np.random.random((3,4))
b = np.random.random((3,4,5))
a = r[:,:,np.newaxis] * b
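For what it's worth, here is a small check that the broadcast product matches the explicit "repeat r along the third axis" approach from the question. The einsum line is an alternative spelling that I am adding for comparison; it is not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 3, 4, 5
r = rng.random((n, m))
B = rng.random((n, m, p))

# Broadcasting: r[:, :, np.newaxis] has shape (n, m, 1) and is stretched along k
A_broadcast = r[:, :, np.newaxis] * B

# Explicit repetition, as in the question (allocates a full (n, m, p) copy of r)
A_repeated = np.repeat(r[:, :, np.newaxis], p, axis=2) * B

# Alternative spelling with einsum: A[i, j, k] = r[i, j] * B[i, j, k]
A_einsum = np.einsum('ij,ijk->ijk', r, B)

print(np.allclose(A_broadcast, A_repeated), np.allclose(A_broadcast, A_einsum))
```

The broadcast version avoids materialising the repeated copy of r, which is where the question's approach spends its extra memory.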

Efficiently computing multiple tensor inner products

I'm working with a k x k x k x k tensor (say S) and an array X of size (n, k). Roughly, X's rows correspond to node features for a graph. For each pair of edges (say e = (u, v) and e' = (u_, v_)) I want to compute a new element as follows:
elt = np.sum(S * np.multiply.outer(np.outer(X[u, :], X[v, :]), np.outer(X[u_, :], X[v_, :])))
I wonder if there is a way to do this more efficiently instead of 4 nested loops over indices.
If I was working with just pairs of nodes and S was just a k x k matrix, this could be written simply as
all_elts = X @ S @ X.T
However, I'm not sure how this generalizes over multiple dimensions. Any help is much appreciated!

Here is an example to show how to use einsum():
import numpy as np
from itertools import product
n = 4
x = np.random.randn(n, n)
S = np.random.randn(n, n, n, n)
res = np.zeros((n, n, n, n))
for i, j, k, l in product(range(n), range(n), range(n), range(n)):
    res[i, j, k, l] = np.sum(S * np.multiply.outer(np.outer(x[i, :], x[j, :]), np.outer(x[k, :], x[l, :])))
res2 = np.einsum("efgh,ae,bf,cg,dh->abcd", S, x, x, x, x)
np.allclose(res, res2)
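If you only need the values for pairs of edges rather than for every 4-tuple of nodes, the same einsum idea can be restricted to an edge list. This is only a sketch: the edges array and the helper elt are hypothetical names, and I am assuming the edges are stored as an integer array of (u, v) rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 3
X = rng.standard_normal((n, k))          # node features, one row per node
S = rng.standard_normal((k, k, k, k))
edges = np.array([[0, 1], [2, 3], [4, 5], [1, 4]])  # hypothetical edge list

U = X[edges[:, 0]]   # features of each edge's first node, shape (num_edges, k)
V = X[edges[:, 1]]   # features of each edge's second node

# all_elts[p, q] = sum_{efgh} S[e,f,g,h] * U[p,e] * V[p,f] * U[q,g] * V[q,h]
all_elts = np.einsum('efgh,pe,pf,qg,qh->pq', S, U, V, U, V)

def elt(e, e_):
    """Formula from the question for a single pair of edges."""
    (u, v), (u_, v_) = e, e_
    return np.sum(S * np.multiply.outer(np.outer(X[u], X[v]), np.outer(X[u_], X[v_])))

reference = np.array([[elt(e, f) for f in edges] for e in edges])
print(np.allclose(all_elts, reference))
```

The result has one row and one column per edge, so its size grows with the number of edges squared instead of n to the fourth power.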

Wiki example for Arnoldi iteration only works for real matrices?

The Wikipedia entry for the Arnoldi method provides a Python example that produces a basis of the Krylov subspace of a matrix A. Supposedly, if A is Hermitian (i.e. if A == A.conj().T), then the Hessenberg matrix h generated by this algorithm is tridiagonal (source). However, when I use the Wikipedia code on a real-world Hermitian matrix, the Hessenberg matrix is not at all tridiagonal. When I perform the computation on the real part of A (so that A == A.T), I do get a tridiagonal Hessenberg matrix, so there seems to be a problem with the imaginary components of A. Does anybody know why the Wikipedia code doesn't produce the expected results?
Working example:
import numpy as np
import matplotlib.pyplot as plt
from scipy.linalg import circulant
def arnoldi_iteration(A, b, n):
    m = A.shape[0]
    h = np.zeros((n + 1, n), dtype=complex)
    Q = np.zeros((m, n + 1), dtype=complex)
    q = b / np.linalg.norm(b)  # Normalize the input vector
    Q[:, 0] = q  # Use it as the first Krylov vector
    for k in range(n):
        v = A.dot(q)  # Generate a new candidate vector
        for j in range(k + 1):  # Subtract the projections on previous vectors
            h[j, k] = np.dot(Q[:, j], v)
            v = v - h[j, k] * Q[:, j]
        h[k + 1, k] = np.linalg.norm(v)
        eps = 1e-12  # If v is shorter than this threshold it is the zero vector
        if h[k + 1, k] > eps:  # Add the produced vector to the list, unless
            q = v / h[k + 1, k]  # the zero vector is produced.
            Q[:, k + 1] = q
        else:  # If that happens, stop iterating.
            return Q, h
    return Q, h
# Construct matrix A
N = 2**4
I = np.eye(N)
k = np.fft.fftfreq(N, 1.0 / N) + 0.5
alpha = np.linspace(0.1, 1.0, N)*2e2
c = np.fft.fft(alpha) / N
C = circulant(c)
A = np.einsum("i, ij, j->ij", k, C, k)
# Show that A is Hermitian
print(np.allclose(A, A.conj().T))
# Arbitrary (random) initial vector
np.random.seed(0)
v = np.random.rand(N)
# Perform Arnoldi iteration with complex A
_, h = arnoldi_iteration(A, v, N)
# Perform Arnoldi iteration with real A
_, h2 = arnoldi_iteration(np.real(A), v, N)
# Plot results
plt.subplot(121)
plt.imshow(np.abs(h))
plt.title("Complex A")
plt.subplot(122)
plt.imshow(np.abs(h2))
plt.title("Real A")
plt.tight_layout()
plt.show()
Result: (figure not preserved: two side-by-side panels showing |h|, titled "Complex A" and "Real A")

After browsing through some conference presentation slides, I realised that at some point Q had to be conjugated when A is complex. The correct algorithm is posted below for reference, with the code change marked (note that this correction has also been submitted to the Wikipedia entry):
import numpy as np
def arnoldi_iteration(A, b, n):
    m = A.shape[0]
    h = np.zeros((n + 1, n), dtype=complex)
    Q = np.zeros((m, n + 1), dtype=complex)
    q = b / np.linalg.norm(b)
    Q[:, 0] = q
    for k in range(n):
        v = A.dot(q)
        for j in range(k + 1):
            h[j, k] = np.dot(Q[:, j].conj(), v)  # <-- Q needs conjugation!
            v = v - h[j, k] * Q[:, j]
        h[k + 1, k] = np.linalg.norm(v)
        eps = 1e-12
        if h[k + 1, k] > eps:
            q = v / h[k + 1, k]
            Q[:, k + 1] = q
        else:
            return Q, h
    return Q, h
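A quick way to see the effect of the conjugation is to run the corrected iteration on a small random Hermitian matrix and check that everything above the first superdiagonal of h vanishes. The sketch below is my own compact variant (the name arnoldi and the test matrix are assumptions); it uses np.vdot, which conjugates its first argument, in place of the explicit .conj().

```python
import numpy as np

def arnoldi(A, b, n, eps=1e-12):
    m = A.shape[0]
    h = np.zeros((n + 1, n), dtype=complex)
    Q = np.zeros((m, n + 1), dtype=complex)
    Q[:, 0] = b / np.linalg.norm(b)
    for k in range(n):
        v = A @ Q[:, k]
        for j in range(k + 1):
            # np.vdot(a, b) computes sum(conj(a) * b)
            h[j, k] = np.vdot(Q[:, j], v)
            v = v - h[j, k] * Q[:, j]
        norm_v = np.linalg.norm(v)
        h[k + 1, k] = norm_v
        if norm_v <= eps:  # breakdown: v is numerically the zero vector
            break
        Q[:, k + 1] = v / norm_v
    return Q, h

rng = np.random.default_rng(0)
size, n = 8, 5
M = rng.standard_normal((size, size)) + 1j * rng.standard_normal((size, size))
A = M + M.conj().T                 # Hermitian by construction
b = rng.standard_normal(size)

Q, h = arnoldi(A, b, n)
above_band = np.triu(h[:n, :n], 2)  # entries above the first superdiagonal
is_tridiagonal = np.allclose(above_band, 0, atol=1e-8)
is_orthonormal = np.allclose(Q.conj().T @ Q, np.eye(n + 1), atol=1e-8)
print(is_tridiagonal, is_orthonormal)
```

Entries below the first subdiagonal are zero by construction, so checking the upper part is enough to confirm that h is tridiagonal.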

Python implementation of statistical Sweep operator

I am learning some techniques for doing statistics with missing data from a book (Statistical Analysis with Missing Data by Little and Rubin). One particularly useful function for working with monotone non-response data is the Sweep Operator (details on pages 148-151). I know that the R module gmm has the swp function which does this, but I was wondering if anyone has implemented this function in Python, ideally using NumPy matrices to hold the input data. I searched Stack Overflow and also did several web searches without success. Thanks for any help.
Here is the definition.
A PxP symmetric matrix G is said to be swept on row and column k if it is replaced by another symmetric PxP matrix H with elements defined as follows:
h_kk = -1/g_kk
h_jk = h_kj = g_jk/g_kk for j != k
h_jl = g_jl - g_jk g_kl / g_kk for j != k, l != k
G = [g11, g12, g13
     g12, g22, g23
     g13, g23, g33]
H = SWP(1,G) = [-1/g11,  g12/g11,          g13/g11
                g12/g11, g22-g12^2/g11,    g23-g13*g12/g11
                g13/g11, g23-g13*g12/g11,  g33-g13^2/g11]
kvec = [k1,k2,k3]
SWP[kvec,G] = SWP(k1,SWP(k2,SWP(k3,G)))
Inverse function
H = RSW(k,G)
h_kk = -1/g_kk
h_jk = h_kj = -g_jk/g_kk for j != k
h_jl = g_jl - g_jk g_kl / g_kk for j != k, l != k
G == SWP(k,RSW(k,G)) == RSW(k,SWP(k,G))

import numpy as np
def sweep(g, k):
    g = np.asarray(g)
    n = g.shape[0]
    if g.shape != (n, n):
        raise ValueError('Not a square array')
    if not np.allclose(g - g.T, 0):
        raise ValueError('Not a symmetrical array')
    if k >= n:
        raise ValueError('Not a valid row number')
    # Fill with the general formula
    h = g - np.outer(g[:, k], g[k, :]) / g[k, k]
    # h = g - g[:, k:k+1] * g[k, :] / g[k, k]
    # Modify the k-th row and column
    h[:, k] = g[:, k] / g[k, k]
    h[k, :] = h[:, k]
    # Modify the pivot
    h[k, k] = -1 / g[k, k]
    return h
I have no way of testing the above code, but I found an alternative description here, which is also valid for non-symmetrical matrices and can be calculated as follows:
def sweep_non_sym(a, k):
    a = np.asarray(a)
    n = a.shape[0]
    if a.shape != (n, n):
        raise ValueError('Not a square array')
    if k >= n:
        raise ValueError('Not a valid row number')
    # Fill with the general formula
    b = a - np.outer(a[:, k], a[k, :]) / a[k, k]
    # b = a - a[:, k:k+1] * a[k, :] / a[k, k]
    # Modify the k-th row and column
    b[k, :] = a[k, :] / a[k, k]
    b[:, k] = -a[:, k] / a[k, k]
    # Modify the pivot
    b[k, k] = 1 / a[k, k]
    return b
This one does give the correct results for the examples in that link:
>>> a = [[2,4],[3,1]]
>>> sweep_non_sym(a, 0)
array([[ 0.5,  2. ],
       [-1.5, -5. ]])
>>> sweep_non_sym(sweep_non_sym(a, 0), 1)
array([[-0.1,  0.4],
       [ 0.3, -0.2]])
>>> np.dot(a, sweep_non_sym(sweep_non_sym(a, 0), 1))
array([[ 1.00000000e+00,  0.00000000e+00],
       [ 5.55111512e-17,  1.00000000e+00]])
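The symmetric version from the question's definition can be checked in a similar way. The sketch below is a stripped-down copy of sweep without the input validation, plus a reverse_sweep that I wrote from the RSW definition, assuming its general term reads h_jl = g_jl - g_jk g_kl / g_kk. It relies on the property that sweeping a symmetric positive definite matrix on every index yields the negative of its inverse; the test matrix G is arbitrary.

```python
import numpy as np

def sweep(g, k):
    """SWP(k, G) for a symmetric matrix, following the definition in the question."""
    g = np.asarray(g, dtype=float)
    h = g - np.outer(g[:, k], g[k, :]) / g[k, k]
    h[:, k] = g[:, k] / g[k, k]
    h[k, :] = h[:, k]
    h[k, k] = -1.0 / g[k, k]
    return h

def reverse_sweep(g, k):
    """RSW(k, G): same as sweep except for the sign of the k-th row and column."""
    g = np.asarray(g, dtype=float)
    h = g - np.outer(g[:, k], g[k, :]) / g[k, k]
    h[:, k] = -g[:, k] / g[k, k]
    h[k, :] = h[:, k]
    h[k, k] = -1.0 / g[k, k]
    return h

# Arbitrary symmetric positive definite test matrix
G = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 0.5],
              [1.0, 0.5, 2.0]])

swept_all = sweep(sweep(sweep(G, 0), 1), 2)
round_trip = reverse_sweep(sweep(G, 1), 1)

print(np.allclose(swept_all, -np.linalg.inv(G)))
print(np.allclose(round_trip, G))
```

Note that indices here are 0-based, whereas the definition in the question numbers rows and columns from 1.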
