I have an issue with matrix multiplication and reshape in Python. For example, I have a column vector S of size (16,1) and a matrix H of size (4,4). I need to reshape S into (4,4) in order to multiply it with H, and then reshape the result back into (16,1). I did that in Matlab as below:
clear all; clc; clear
H = randn(4,4,16) + 1j.*randn(4,4,16);
S = randn(16,1) + 1j.*randn(16,1);
for ij = 1 : 16
    y(:,:,ij) = reshape(H(:,:,ij)*reshape(S,4,[]),[],1);
end
y = mean(y,3);
Coming to Python:
import numpy as np
H = np.random.randn(4,4,16) + 1j * np.random.randn(4,4,16)
S = np.random.randn(16,) + 1j * np.random.randn(16,)
y = np.zeros((4,4,16),dtype=complex)
for ij in range(16):
    y[:,:,ij] = np.reshape(H[:,:,ij]@S.reshape(4,4),16,1)
But I get an error here that we can't reshape the matrix y of size 256 into 16x1.
Does anyone have an idea about how to solve this problem?
Simply do this:
S.shape = (4,4)
for ij in range(16):
    y[:,:,ij] = H[:,:,ij] @ S
S.shape = -1 # equivalent to 16
When both operands have two or more axes, np.dot sums over the last axis of the first operand and the second-to-last axis of the second. You can move your axes around to make use of this.
Keep in mind that reshape(S, 4, 4) in Matlab is likely equivalent to S.reshape(4, 4).T in Python.
So given H of shape (4, 4, 16) and S of shape (16,), you can multiply each channel of H by a reshaped S using
np.moveaxis(np.dot(np.moveaxis(H, -1, 0), S.reshape(4, 4).T), 0, -1)
The inner moveaxis call makes H into (16, 4, 4) for easy multiplication. The outer one reverses the effect.
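Alternatively, you could compute the transposed product and reorder the axes afterwards. np.transpose(H) has shape (16, 4, 4) and holds the transpose of each channel, so you can write
np.transpose(np.dot(S.reshape(4, 4), np.transpose(H)), (2, 0, 1))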
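As a sanity check, the moveaxis one-liner above can be compared against an explicit loop. This is only a minimal sketch: it assumes random test data and Matlab-style (column-major) reshaping of S, and the names rng, S_mat, y_loop and y_vec are illustrative rather than taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4, 16)) + 1j * rng.standard_normal((4, 4, 16))
S = rng.standard_normal(16) + 1j * rng.standard_normal(16)

# Matlab's reshape(S, 4, 4) fills column by column, i.e. Fortran order.
S_mat = S.reshape(4, 4, order="F")
assert np.array_equal(S_mat, S.reshape(4, 4).T)

# Reference: multiply each 4x4 channel of H by S_mat in a loop.
y_loop = np.empty((4, 4, 16), dtype=complex)
for k in range(16):
    y_loop[:, :, k] = H[:, :, k] @ S_mat

# Vectorized: move the channel axis to the front, multiply, move it back.
y_vec = np.moveaxis(np.dot(np.moveaxis(H, -1, 0), S_mat), 0, -1)
assert np.allclose(y_loop, y_vec)
```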
There are two issues in your solution
1) The np.reshape function takes the new shape as a single tuple argument, not as separate arguments (the ndarray.reshape method accepts both forms).
2) The shape of your y-array should be 16x1x16, not 4x4x16. In Matlab, there is no issue since it automatically reshapes y as you update it.
The correct version would be the following:
import numpy as np
H = np.random.randn(4,4,16) + 1j * np.random.randn(4,4,16)
S = np.random.randn(16,) + 1j * np.random.randn(16,)
y = np.zeros((16,1,16),dtype=complex)
for ij in range(16):
    y[:,:,ij] = np.reshape(H[:,:,ij]@S.reshape((4,4)),(16,1))
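To match the Matlab result more closely, including the column-major reshape and the final mean over the third axis, something like the following might be used. It is a sketch under the assumption that order="F" reflects the intended Matlab behaviour; S_mat, y_mean and y_fast are names introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4, 16)) + 1j * rng.standard_normal((4, 4, 16))
S = rng.standard_normal(16) + 1j * rng.standard_normal(16)

# order="F" mimics Matlab's column-major reshape (an assumption about intent).
S_mat = S.reshape((4, 4), order="F")

y = np.zeros((16, 1, 16), dtype=complex)
for ij in range(16):
    y[:, :, ij] = np.reshape(H[:, :, ij] @ S_mat, (16, 1), order="F")
y_mean = y.mean(axis=2)  # Matlab: mean(y, 3)

# Matrix multiplication is linear, so averaging H first gives the same result
# without any loop.
y_fast = np.reshape(H.mean(axis=2) @ S_mat, (16, 1), order="F")
assert np.allclose(y_mean, y_fast)
```

The loop-free form relies only on linearity, so it applies whenever the same S multiplies every channel of H.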
When I have a tensor m of shape [12, 10] and a vector s of scalars with shape [12], how can I multiply each row of m with the corresponding scalar in s?
You need to add a corresponding singleton dimension:
m * s[:, None]
s[:, None] has shape (12, 1). When multiplying a (12, 10) tensor by a (12, 1) tensor, PyTorch broadcasts s along the second (singleton) dimension and performs the element-wise product correctly.
You can broadcast a vector to a higher dimensional tensor like so:
def row_mult(input, vector):
    # one singleton dimension for every dimension of input after the first
    extra_dims = (1,)*(input.dim()-1)
    return input * vector.view(-1, *extra_dims)
A technique that is slightly hard to understand at first, but very powerful, is Einstein summation:
torch.einsum('i,ij->ij', s, m)
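A quick way to convince yourself that the indexing, unsqueeze and einsum forms agree is to run them side by side. This is a small sketch with random data; the variable names by_index, by_unsqueeze and by_einsum are made up for the comparison.

```python
import torch

torch.manual_seed(0)
m = torch.rand(12, 10)
s = torch.rand(12)

by_index = m * s[:, None]                    # s viewed as (12, 1)
by_unsqueeze = m * s.unsqueeze(1)            # same view, spelled explicitly
by_einsum = torch.einsum('i,ij->ij', s, m)   # no index dropped, so no summation

assert torch.allclose(by_index, by_unsqueeze)
assert torch.allclose(by_index, by_einsum)
```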
Shai's answer works if you know the number of dimensions in advance and can hardcode the correct number of Nones. This can be extended to extra dimensions if required:
mask = (torch.rand(12) > 0.5).int()
data = (torch.rand(12, 2, 3, 4))
result = data * mask[:,None,None,None]
result.shape # torch.Size([12, 2, 3, 4])
mask[:,None,None,None].shape # torch.Size([12, 1, 1, 1])
If you are dealing with data of variable or unknown dimensions, then it may require manually extending mask to the correct shape:
mask = (torch.rand(12) > 0.5).int()
while mask.dim() < data.dim(): mask.unsqueeze_(1)
result = data * mask
result.shape # torch.Size([12, 2, 3, 4])
mask.shape # torch.Size([12, 1, 1, 1])
This is a bit of an ugly solution, but it does work. There is probably a more elegant way to reshape the mask tensor inline for a variable number of dimensions.
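One possible inline alternative is to build the target shape from data.dim() and reshape the mask in a single call, which also avoids modifying mask in place. The helper name expand_trailing below is hypothetical, and this is only a sketch of the idea.

```python
import torch

def expand_trailing(vec, ref):
    """Reshape 1-D vec to (n, 1, ..., 1) so it broadcasts over ref's trailing dims."""
    return vec.reshape(-1, *([1] * (ref.dim() - 1)))

torch.manual_seed(0)
mask = (torch.rand(12) > 0.5).int()
data = torch.rand(12, 2, 3, 4)

expanded = expand_trailing(mask, data)   # shape (12, 1, 1, 1); mask itself is unchanged
result = data * expanded
```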
I was wondering if there's a way to compute multiple outer products and stack the results in a single operation.
Say I have an Nx1 vector and take the outer product with a 1xM vector, the result will be an NxM matrix.
What if I had an NxR matrix A, and an RxM matrix B. Is it possible to construct an NxMxR matrix where each layer of the output matrix is the outer product of the corresponding column of A and row of B?
I know it's really easy to do this in a single for loop over R, but I wanted to know if there's a faster way using numpy builtins (as there usually is when numpy is concerned).
I haven't been able to figure out a set of indices that work with einsum (and I'm not even sure if einsum is the right approach, since there is no summation involved here)
Yes, of course, using broadcasting or einsum (the fact that there is no summation does not matter):
import numpy
N, M, R = 8, 9, 16
A = numpy.random.rand(N)
B = numpy.random.rand(M)
C = A[:, None] * B[None, :]
D = numpy.einsum('a,b->ab', A, B)
numpy.allclose(C, D)
# True
C.shape
# (8, 9)
A = numpy.random.rand(N, R)
B = numpy.random.rand(M, R)
C = A[:, None, :] * B[None, :, :]
D = numpy.einsum('ar,br->abr', A, B)
numpy.allclose(C, D)
# True
C.shape
# (8, 9, 16)
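The answer above stores B as (M, R). For the layout given in the question, with A of shape (N, R) and B of shape (R, M), the same idea might look like the sketch below, checked against a plain loop over np.outer. The names stacked and bcast are illustrative.

```python
import numpy as np

N, M, R = 8, 9, 16
rng = np.random.default_rng(0)
A = rng.random((N, R))
B = rng.random((R, M))

# Layer r of the output is outer(A[:, r], B[r, :]).
stacked = np.einsum('nr,rm->nmr', A, B)

# Same result by broadcasting: (N, 1, R) * (1, M, R) -> (N, M, R).
bcast = A[:, None, :] * B.T[None, :, :]

for r in range(R):
    assert np.allclose(stacked[:, :, r], np.outer(A[:, r], B[r, :]))
assert np.allclose(stacked, bcast)
```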
The Problem:
I want to calculate the dot product of a very large set of data. I am able to do this in a nested for-loop, but this is way too slow.
Here is a small example:
import numpy as np
points = np.array([[0.5, 2, 3, 5.5, 8, 11], [1, 2, -1.5, 0.5, 4, 5]])
lines = np.array([[0, 2, 4, 6, 10, 10, 0, 0], [0, 0, 0, 0, 0, 4, 4, 0]])
x1 = lines[0][0:-1]
y1 = lines[1][0:-1]
L1 = np.asarray([x1, y1])
# calculate the relative length of the projection
# of each point onto each line
a = np.diff(lines)
b = points[:,:,None] - L1[:,None,:]
print(a.shape)
print(b.shape)
[rows, cols, pages] = np.shape(b)
Z = np.zeros((cols, pages))
for k in range(cols):
    for l in range(pages):
        Z[k][l] = a[0][l]*b[0][k][l] + a[1][l]*b[1][k][l]
N = np.linalg.norm(a, axis=0)**2
relativeProjectionLength = np.squeeze(np.asarray(Z/N))
In this example, the first dimension of both a and b holds the x- and y-coordinates that I need for the dot product.
The shape of a is (2,7) and the shape of b is (2,6,7). Since the dot product reduces the first dimension, I would expect the result to have shape (6,7). How can I calculate this without the slow loops?
What I have tried:
I think that numpy.dot with correct broadcasting could do the job, however I have trouble setting up the dimensions correctly.
a = a[:, None, :]
Z = np.dot(a,b)
This gives me the following error:
shapes (2,1,7) and (2,6,7) not aligned: 7 (dim 2) != 6 (dim 1)
You can use np.einsum -
np.einsum('ij,ikj->kj',a,b)
Explanation :
Keep the last axes aligned for the two inputs.
Sum-reduce the first from those.
Let the rest stay, which is the second axis of b.
Usual rules on whether to use einsum or stick to a loopy-dot based method apply here.
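To check the einsum call against the nested loop from the question, a short comparison along these lines can help. It reuses the question's points and lines arrays; Z_loop, Z_einsum and Z_bcast are names introduced here, and the broadcasting variant is just one more way to write the same reduction.

```python
import numpy as np

points = np.array([[0.5, 2, 3, 5.5, 8, 11], [1, 2, -1.5, 0.5, 4, 5]])
lines = np.array([[0, 2, 4, 6, 10, 10, 0, 0], [0, 0, 0, 0, 0, 4, 4, 0]])

L1 = lines[:, :-1]                          # start point of each segment, (2, 7)
a = np.diff(lines)                          # segment vectors, (2, 7)
b = points[:, :, None] - L1[:, None, :]     # point minus segment start, (2, 6, 7)

# Reference: the nested loop from the question.
Z_loop = np.zeros((6, 7))
for k in range(6):
    for l in range(7):
        Z_loop[k, l] = a[0, l] * b[0, k, l] + a[1, l] * b[1, k, l]

Z_einsum = np.einsum('ij,ikj->kj', a, b)    # sum over the coordinate axis i
Z_bcast = (a[:, None, :] * b).sum(axis=0)   # same reduction via broadcasting

assert np.allclose(Z_loop, Z_einsum)
assert np.allclose(Z_loop, Z_bcast)
```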
numpy.dot does not reduce the first dimension. From the docs:
For N dimensions it is a sum product over the last axis of a and the second-to-last of b:
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
That is exactly what the error is telling you: it is attempting to match axis 2 of the first array to axis 1 of the second.
You can fix this using numpy.rollaxis or better yet numpy.moveaxis. Instead of a = a[:, None, :], do
a = np.moveaxis(a, 0, -1)
b = np.moveaxis(b, 0, -2)
Z = np.dot(a, b)
Better yet, you can construct your arrays to have the correct shape up front. For example, transpose lines and do a = np.diff(lines, axis=0).
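One caveat worth checking: after moving the axes, np.dot no longer raises, but it pairs every column of a with every column of b, so the result has shape (7, 6, 7) rather than (6, 7). The sketch below uses random arrays with the question's shapes (an assumption made to keep it self-contained) and extracts the matching entries with a diagonal einsum.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((2, 7))       # same shape as a in the question
b = rng.random((2, 6, 7))    # same shape as b in the question

a2 = np.moveaxis(a, 0, -1)   # (7, 2)
b2 = np.moveaxis(b, 0, -2)   # (6, 2, 7)
full = np.dot(a2, b2)        # (7, 6, 7): full[l, k, m] = sum_i a[i, l] * b[i, k, m]

# Keep only the entries where both line indices agree (l == m).
Z_dot = np.einsum('lkl->kl', full)      # (6, 7)

Z_ref = np.einsum('ij,ikj->kj', a, b)   # direct computation for comparison
assert np.allclose(Z_dot, Z_ref)
```

Because the intermediate array grows with the square of the number of lines, the direct einsum is likely the better choice for large inputs.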
I want to use tensordot to compute the dot product over specific dims of two tensors. For example:
A is a tensor whose shape is (3, 4, 5)
B is a tensor whose shape is (3, 5)
I want to take the dot product over A's third dim and B's second dim, and get an output whose shape is (3, 4)
Like below:
for i in range(3):
    C[i] = dot(A[i], B[i])
How can I do this with tensordot?
Well, do you want this in numpy or in Theano?
In the case, where, as you state, you would like to contract axis 3 of A against axis 2 of B, both are straightforward:
import numpy as np
a = np.arange(3 * 4 * 5).reshape(3, 4, 5).astype('float32')
b = np.arange(3 * 5).reshape(3, 5).astype('float32')
result = a.dot(b.T)
In Theano this is written as
import theano.tensor as T
A = T.ftensor3()
B = T.fmatrix()
out = A.dot(B.T)
out.eval({A: a, B: b})
However, the output then has shape (3, 4, 3). Since you seem to want an output of shape (3, 4), the numpy alternative uses einsum, like so
einsum_out = np.einsum('ijk, ik -> ij', a, b)
However, einsum does not exist in Theano. So the specific case here can be emulated as follows
out = (a * b[:, np.newaxis]).sum(2)
which can also be written in Theano
out = (A * B.dimshuffle(0, 'x', 1)).sum(2)
out.eval({A: a, B: b})
In this specific case, einsum is probably easier to understand than tensordot. For example:
c = np.einsum('ijk,ik->ij', a, b)
I'm going to over-simplify the explanation a bit to make things more immediately understandable. We have two input arrays (separated by the comma) and this yields our output array (to the right of the ->).
a has shape 3, 4, 5 and we'll refer to it as ijk
b has shape 3, 5 (ik)
We want the output c to have shape 3, 4 (ij)
Seems a bit magical, right? Let's break that down a bit.
The letters we "lose" as we cross the -> are axes that will be summed over. That's what dot is doing, as well.
We want output with shape 3, 4, so we're eliminating k
Therefore, the output c should be ij
This means we'll refer to b as ik.
As a full example:
import numpy as np
a = np.random.random((3, 4, 5))
b = np.random.random((3, 5))
# Looping through things
c1 = []
for i in range(3):
    c1.append(a[i].dot(b[i]))
c1 = np.array(c1)
# Using einsum instead
c2 = np.einsum('ijk,ik->ij', a, b)
assert np.allclose(c1, c2)
You can do this with tensordot as well. I'll add an example of that as soon as I have a bit more time. (Of course, if anyone else would like to add a tensordot example as another answer in the meantime, feel free!)
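For what it is worth, here is one way the tensordot route might look. tensordot has no notion of a shared batch axis, so it contracts every A[i] with every B[l] and the matching pairs have to be picked out afterwards. This is a sketch rather than a recommendation; the np.matmul variant is an alternative added for comparison, and the names full, c_td and c_mm are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 4, 5))
b = rng.random((3, 5))

# Reference loop from the question.
c_loop = np.stack([a[i].dot(b[i]) for i in range(3)])

# Contract a's axis 2 with b's axis 1 for every pair (i, l): shape (3, 4, 3).
full = np.tensordot(a, b, axes=([2], [1]))
# Keep i == l. np.diagonal appends the diagonal as the last axis, giving (4, 3).
c_td = np.diagonal(full, axis1=0, axis2=2).T

# Batched matrix-vector product: (3, 4, 5) @ (3, 5, 1) -> (3, 4, 1).
c_mm = np.matmul(a, b[:, :, None])[..., 0]

assert np.allclose(c_loop, c_td)
assert np.allclose(c_loop, c_mm)
```

The tensordot version computes three times more products than needed here, which is part of why einsum or matmul tends to be the more natural fit for this case.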