I have a quick question about the tensordot operation. I'm trying to figure out whether a tensordot product between two tensors can produce the output shape I want. One tensor has shape B X L X D and the other has shape B X 1 X D, and I'd like to end up with a B X D matrix.
Currently I'm looping through the B dimension and performing a matrix multiplication between 1 X D and D X L (transposing L X D) matrices and stacking them to end up with B X L matrix at the end. This is obviously not the fastest way possible as a loop can be expensive. Would it be possible to get the desired output of B X D shape by performing a quick tensordot? I cannot seem to figure out a way to get rid of 1 of the B's.
Any insight or direction would be very much appreciated.
One option is to use torch.bmm(), which does exactly that.
It takes tensors of shape (b, n, m) and (b, m, p) and returns the batch matrix multiplication of shape (b, n, p).
(I assume you meant a result of shape B X L, since the matrix multiplication of 1 X D and D X L has shape 1 X L and not 1 X D.)
In your case:
import torch
B, L, D = 32, 10, 512
a = torch.randn(B, 1, D) #shape (B X 1 X D)
b = torch.randn(B, L, D) #shape (B X L X D)
b = b.transpose(1,2) #shape (B X D X L)
result = torch.bmm(a, b)
result = result.squeeze(1) #drop only the singleton axis: (B X 1 X L) -> (B X L)
print(result.shape)
>>> torch.Size([32, 10])
Alternatively, you can use torch.einsum(), which is more compact but less readable in my opinion:
import torch
B, L, D = 32, 10, 512
a = torch.randn(B, 1, D)
b = torch.randn(B, L, D)
result = torch.einsum('abc, adc->ad', a, b)
print(result.shape)
>>> torch.Size([32, 10])
In the torch.bmm() version, the squeeze at the end turns the result of shape (32, 1, 10) into shape (32, 10). The einsum version needs no squeeze because the singleton axis is summed out.
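As a quick sanity check, here is a minimal sketch comparing both approaches against the per-batch loop described in the question. The small sizes, the seed, and the names `looped`, `via_bmm`, and `via_einsum` are my own choices for illustration.

```python
import torch

torch.manual_seed(0)
B, L, D = 4, 3, 5  # small sizes chosen only for illustration
a = torch.randn(B, 1, D)
b = torch.randn(B, L, D)

# Reference: loop over the batch, multiplying (1 x D) by (D x L) each time
looped = torch.stack([(a[i] @ b[i].T).squeeze(0) for i in range(B)])

# Batched matrix multiplication, then drop the singleton middle axis
via_bmm = torch.bmm(a, b.transpose(1, 2)).squeeze(1)

# The same contraction written as an einsum (sums over i and d)
via_einsum = torch.einsum('bid,bld->bl', a, b)

print(looped.shape, via_bmm.shape, via_einsum.shape)
print(torch.allclose(looped, via_bmm, atol=1e-6))
print(torch.allclose(looped, via_einsum, atol=1e-6))
```

All three results should have shape (B, L) and agree up to floating-point rounding.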
I believe torch.einsum to be the most intuitive way to perform tensor summations. With x of shape (B, L, D) and y of shape (B, 1, D):
>>> torch.einsum('bld,bed->bd', x, y)
The result has shape (B, D).
Formulated explicitly, the operation performed here is equivalent to:
res = torch.zeros(B, D)
for b in range(B):
    for l in range(L):
        for d in range(D):
            # accumulate into the (b, d) entry, summing over l
            res[b, d] += x[b, l, d] * y[b, 0, d]
Actually the second axis on y is also looped over, but the range is just [0], since y's 2nd dimension is a singleton.
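If it helps, here is a small sketch that checks this B X D einsum against the explicit triple loop and against a plain broadcasted multiply followed by a sum over L. The tensor sizes and the variable names are assumptions made for the example.

```python
import torch

torch.manual_seed(0)
B, L, D = 3, 4, 5  # small sizes chosen only for illustration
x = torch.randn(B, L, D)
y = torch.randn(B, 1, D)

via_einsum = torch.einsum('bld,bed->bd', x, y)

# Explicit loops: accumulate over l into the (b, d) entry
res = torch.zeros(B, D)
for b in range(B):
    for l in range(L):
        for d in range(D):
            res[b, d] += x[b, l, d] * y[b, 0, d]

# Broadcasting: y is expanded along the L axis, then L is summed out
via_broadcast = (x * y).sum(dim=1)

print(via_einsum.shape)
print(torch.allclose(via_einsum, res, atol=1e-5))
print(torch.allclose(via_einsum, via_broadcast, atol=1e-5))
```

The broadcasted form makes it clear that this result is an elementwise product summed over L, which is a different operation from the B X L matrix product.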
Related
Given two 2-D tensors in PyTorch, A (a X m) and B (m X b), is there an efficient way to obtain a tensor C (m X a X b), where C[i,:,:] is the outer product of the column A[:,i] and the row B[i,:]?
Here I will give an example of the problem:
A = torch.FloatTensor([[1,2],[3,4]])
B = torch.FloatTensor([[1,2,3],[4,5,6]])
Result:
C = torch.FloatTensor([[[1,2,3],[3,6,9]],[[8,10,12],[16,20,24]]])
I have done it using a for-loop. However, it is very inefficient.
Look at torch.einsum:
C = torch.einsum('im,mj->mij', A, B)
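A minimal sketch checking that einsum against a loop of per-index outer products, using the A and B from the question. The name `C_loop` and the indexing trick with `None` are my own choices for the example.

```python
import torch

A = torch.tensor([[1., 2.], [3., 4.]])          # shape (a, m) = (2, 2)
B = torch.tensor([[1., 2., 3.], [4., 5., 6.]])  # shape (m, b) = (2, 3)

# C[m, i, j] = A[i, m] * B[m, j]
C = torch.einsum('im,mj->mij', A, B)

# Loop reference: outer product of column A[:, i] with row B[i, :]
C_loop = torch.stack([A[:, i, None] * B[i, None, :] for i in range(A.shape[1])])

print(C.shape)
print(torch.equal(C, C_loop))
```

The index m appears in both inputs and in the output, so it is kept as a batch-like axis instead of being summed over.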
I have two 3D arrays A and B with shapes (k, n, n) and (k, m, m) respectively. I would like to create a matrix C of shape (k, n+m, n+m) such that for each 0 <= i < k, the 2D matrix C[i,:,:] is the block diagonal matrix obtained by putting A[i, :, :] at the upper left n x n part and B[i, :, :] at the lower right m x m part.
Currently I am using the following to achieve this in NumPy:
C = np.empty((k, n+m, n+m))
for i in range(k):
    C[i, ...] = np.block([[A[i,...], np.zeros((n,m))],
                          [np.zeros((m,n)), B[i,...]]])
I was wondering if there is a way to do this without the for loop. I think if k is large my solution is not very efficient.
IIUC, you can simply slice and assign:
C = np.zeros((k, n+m, n+m),dtype=np.result_type(A,B))
C[:,:n,:n] = A
C[:,n:,n:] = B
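Here is a small self-contained sketch comparing the slice-and-assign version with the np.block loop from the question. The sizes, the seed, and the name `C_loop` are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n, m = 5, 2, 3  # small sizes chosen only for illustration
A = rng.random((k, n, n))
B = rng.random((k, m, m))

# Vectorized: start from zeros and fill the two diagonal blocks
C = np.zeros((k, n + m, n + m), dtype=np.result_type(A, B))
C[:, :n, :n] = A
C[:, n:, n:] = B

# Loop reference from the question
C_loop = np.empty((k, n + m, n + m))
for i in range(k):
    C_loop[i] = np.block([[A[i], np.zeros((n, m))],
                          [np.zeros((m, n)), B[i]]])

print(C.shape)
print(np.array_equal(C, C_loop))
```

Starting from np.zeros rather than np.empty matters here, since the off-diagonal blocks are never written.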
Suppose I have two 2D NumPy arrays A and B, I would like to compute the matrix C whose entries are C[i, j] = f(A[i, :], B[:, j]), where f is some function that takes two 1D arrays and returns a number.
For instance, if def f(x, y): return np.sum(x * y) then I would simply have C = np.dot(A, B). However, for a general function f, are there NumPy/SciPy utilities I could exploit that are more efficient than doing a double for-loop?
For example, take def f(x, y): return np.sum(x != y) / len(x), where x and y are not simply 0/1-bit vectors.
Here is a reasonably general approach using broadcasting.
First, reshape your two matrices to be rank-three tensors that share the axis being compared.
A = A[:, :, np.newaxis] # shape (n, k, 1)
B = B[np.newaxis, :, :] # shape (1, k, m)
Second, apply your function element by element without performing any reduction.
C = f(A, B) # e.g. A != B
Having reshaped your matrices allows numpy to broadcast. The resulting tensor C has shape (n, k, m), where C[i, l, j] combines A[i, l] with B[l, j].
Third, apply any desired reduction by, for example, summing over the index you want to discard:
C = C.sum(axis=1) / C.shape[1]
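To make the broadcasting idea concrete, here is a minimal sketch for the mismatch-fraction example, checked against the double loop. It aligns the shared axis of A and B so that only matching positions are compared; the integer test data and the names `C_loop`, `mismatch`, and `C_vec` are assumptions for the example.

```python
import numpy as np

def f(x, y):
    # Fraction of positions where two 1-D arrays differ
    return np.sum(x != y) / len(x)

rng = np.random.default_rng(0)
A = rng.integers(0, 3, size=(4, 6))  # shape (n, k)
B = rng.integers(0, 3, size=(6, 5))  # shape (k, m)

# Reference: double loop over rows of A and columns of B
C_loop = np.array([[f(A[i, :], B[:, j]) for j in range(B.shape[1])]
                   for i in range(A.shape[0])])

# Broadcasting: (n, k, 1) against (1, k, m) gives an (n, k, m) boolean tensor
mismatch = A[:, :, None] != B[None, :, :]
# Reduce over the shared axis k and normalize by its length
C_vec = mismatch.sum(axis=1) / A.shape[1]

print(C_vec.shape)
print(np.allclose(C_loop, C_vec))
```

The intermediate tensor has n * k * m elements, so for large inputs it may be worth processing A in row chunks.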
I have a bunch of 3x2 matrices, let's say 777 of them, and just as many right-hand sides of size 3. For each of them, I would like to know the least-squares solution, so I'm doing
import numpy
A = numpy.random.rand(3, 2, 777)
b = numpy.random.rand(3, 777)
for k in range(777):
    numpy.linalg.lstsq(A[..., k], b[..., k])
That works, but is slow. I'd much rather compute all the solutions in one go, but upon
numpy.linalg.lstsq(A, b)
I'm getting
numpy.linalg.linalg.LinAlgError: 3-dimensional array given. Array must be two-dimensional
Any hints on how to broadcast numpy.linalg.lstsq?
One can make use of the fact that if A = U \Sigma V^T is the singular value decomposition of A,
x = V \Sigma^+ U^T b
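is the least-squares solution to Ax = b, where \Sigma^+ is the pseudoinverse of \Sigma. numpy.linalg.svd broadcasts over leading dimensions, so it only takes a bit of fiddling with einsum to get it all right. Note that the third value returned by numpy.linalg.svd is V^T, not V: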
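import numpy
A = numpy.random.rand(7, 3, 2)
b = numpy.random.rand(7, 3)
# reference: solve each system separately
for k in range(7):
    x, res, rank, sigma = numpy.linalg.lstsq(A[k], b[k], rcond=None)
    print(x)
print()
# batched: v holds V^T for each matrix in the stack
u, s, v = numpy.linalg.svd(A, full_matrices=False)
uTb = numpy.einsum('ijk,ij->ik', u, b)  # U^T b
xx = numpy.einsum('ijk,ij->ik', v, uTb / s)  # V \Sigma^+ U^T b
print(xx)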
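For anyone who wants to verify this, here is a minimal sketch that compares the batched SVD solution with the per-matrix lstsq loop. It also tries numpy.linalg.pinv, which accepts stacked matrices, as an alternative I am adding for comparison and which is not part of the answer above. The names `x_loop`, `x_svd`, and `x_pinv` are my own.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((7, 3, 2))  # stack of seven 3x2 matrices
b = rng.random((7, 3))     # one right-hand side per matrix

# Reference: one lstsq call per system
x_loop = np.stack([np.linalg.lstsq(A[k], b[k], rcond=None)[0] for k in range(7)])

# Batched SVD: vt holds V^T for each matrix
u, s, vt = np.linalg.svd(A, full_matrices=False)
uTb = np.einsum('ijk,ij->ik', u, b)           # U^T b, shape (7, 2)
x_svd = np.einsum('ijk,ij->ik', vt, uTb / s)  # V Sigma^+ U^T b

# Alternative: batched pseudoinverse applied to b as a column vector
x_pinv = (np.linalg.pinv(A) @ b[..., None])[..., 0]

print(x_svd.shape)
print(np.allclose(x_loop, x_svd))
print(np.allclose(x_loop, x_pinv))
```

Dividing by s assumes every matrix has full column rank; for rank-deficient inputs the small singular values would need to be zeroed out instead.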
I have some large arrays each with i elements, call them X, Y, Z, for which I need to find some values a, b--where a and b are real numbers between 0 and 1--such that, for the following functions,
r = X - a*Y - b*Z
r_av = Sum(r)/i
rms = Sum((r - r_av)^2), summing over the i pixels
I want to minimize the rms. Basically I'm looking to minimize the scatter in r, and thus need to find the right a and b to do that. So far I have thought to do this with nested loops in one of two ways: either 1) just looping through a range of possible a, b and then selecting the smallest rms, or 2) inserting a while statement so that the loop terminates once rms stops decreasing with decreasing a, b. Here's some pseudocode for these:
1) List
for a = 1
    for b = 1
        calculate m
        b = b - .001
    a = a - .001
loop 1000 times
sort m values, from smallest
print (a,b) corresponding to smallest m
2) Terminate
for a = 1
    for b = 1
        calculate m
        while m > previous step,
            b = b - .001
    a = a - .001
There is already a handy formula for least squares fitting.
I came up with two different ways to solve your problem.
For the first one, consider the matrix K:
L = len(X)
K = np.identity(L) - np.ones((L, L)) / L
In your case, A and B are defined as:
A = K.dot(np.array([Y, Z]).transpose())
B = K.dot(np.array([X]).transpose())
Apply the formula to find the C that minimizes the squared error ||A * C - B||^2:
C = np.linalg.inv(np.transpose(A).dot(A))
C = C.dot(np.transpose(A)).dot(B)
Then the result is:
a, b = C.reshape(2)
Also, note that numpy already provides linalg.lstsq that does the exact same thing:
a, b = np.linalg.lstsq(A, B)[0].reshape(2)
A simpler way is to define A as:
A = np.array([Y, Z, [1]*len(X)]).transpose()
Then solve it against X to get the coefficients and the mean:
a, b, mean = np.linalg.lstsq(A, X)[0]
If you need a proof of this result, have a look at this post.
Example:
>>> import numpy as np
>>> X = [5, 7, 9, 5]
>>> Y = [2, 0, 4, 1]
>>> Z = [7, 2, 4, 6]
>>> A = np.array([Y, Z, [1] * len(X)]).transpose()
>>> a, b, mean = np.linalg.lstsq(A, X)[0]
>>> print(a, b, mean)
0.860082304527 -0.736625514403 8.49382716049
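As a rough check that the two formulations agree, here is a minimal sketch on the same data. Subtracting the mean from each array is used in place of multiplying by K (the two are equivalent), and the helper `scatter` and the names `a1`, `b1`, `a2`, `b2` are my own additions for illustration.

```python
import numpy as np

X = np.array([5., 7., 9., 5.])
Y = np.array([2., 0., 4., 1.])
Z = np.array([7., 2., 4., 6.])

# Way 1: center every array (same effect as applying K), then fit a and b
Yc, Zc, Xc = Y - Y.mean(), Z - Z.mean(), X - X.mean()
a1, b1 = np.linalg.lstsq(np.column_stack([Yc, Zc]), Xc, rcond=None)[0]

# Way 2: add a column of ones so the mean is fitted as a third coefficient
A = np.column_stack([Y, Z, np.ones_like(X)])
a2, b2, mean = np.linalg.lstsq(A, X, rcond=None)[0]

def scatter(a, b):
    # The quantity from the question: Sum((r - r_av)^2)
    r = X - a * Y - b * Z
    return np.sum((r - r.mean()) ** 2)

print(np.allclose([a1, b1], [a2, b2]))
# The fitted third coefficient is the mean of r at the optimum
print(np.isclose(mean, (X - a2 * Y - b2 * Z).mean()))
# Moving away from the fitted values should not reduce the scatter
print(scatter(a2, b2) <= scatter(a2 + 0.01, b2))
```

Note that this is an unconstrained fit: it does not enforce that a and b lie between 0 and 1, and on this data b comes out negative.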