Kindly take a look at the code below.
import numpy as np
a = np.random.rand(10)
b = np.random.rand(10)
c = np.dot(a,b)
a and b are arrays of shape (10,). The dot product, however, is a scalar value. Following the matrix multiplication rule ((m×n) · (n×p) = (m×p)), I expected an array of shape (10,). Can you please explain?
A related question:
Say you have an array X of shape (n,),
import numpy as np
n = 10
X = np.random.rand(n)
and you want to compute the dot product X X^T (by X^T I mean the transpose of X). The result should be an n-by-n matrix. However, using
np.dot(X, X.T)
will give a scalar. It's as if it computed X^T X instead, unless you do the following:
X = np.reshape(X, (X.shape[0], 1))
np.dot(X, X.T)
Is there a way to do it without having to reshape the numpy vector?
If both a and b are 1-D arrays, numpy.dot(a, b) returns the inner product of vectors (without complex conjugation).
You can use the numpy.outer function instead:
np.outer(X, X)
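For reference, a minimal sketch contrasting the two results (the values are random, so only the shapes matter here):
import numpy as np
n = 10
X = np.random.rand(n)
print(np.dot(X, X))          # inner product of two 1-D arrays: a scalar
print(np.outer(X, X).shape)  # (10, 10), no reshape needed
Xc = np.reshape(X, (n, 1))   # column vector, shape (10, 1)
print(np.allclose(np.outer(X, X), np.dot(Xc, Xc.T)))  # True: outer == X X^T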
I am confused about the following example of a matrix-tensor multiplication that returns a vector. At first glance I thought it would mean multiplying the first dimension of the tensor dydx by the matrix dLdy, but I don't get the expected results, as shown below. So what is the meaning of this einsum?
import torch
import numpy as np
dLdy = torch.randn(2,2)
dydx = torch.randn(2,2,2)
torch.einsum('jk,jki->i', dLdy, dydx)
tensor([0.3115, 3.7255])
dLdy
tensor([[-0.4845, 0.6838],
[-1.1723, 1.4914]])
dydx
tensor([[[ 1.5496, -1.2722],
[ 0.1221, 1.0495]],
[[-1.4882, 0.0307],
[-0.5134, 1.6276]]])
(dLdy * dydx[0]).sum()
-0.1985
For A and B this is a contraction (sum) over the first two dimensions j and k, so
res(i) = sum_{j,k} A(j,k) * B(j,k,i)
for example:
import torch
import numpy as np
dLdy = torch.randn(2,2)
dydx = torch.randn(2,2,2)
print(torch.einsum('jk,jki->i', dLdy, dydx))
print((dLdy * dydx[:,:,0]).sum())
print((dLdy * dydx[:,:,1]).sum())
produces
tensor([4.6025, 1.8987])
tensor(4.6025)
tensor(1.8987)
i.e., (dLdy * dydx[:,:,0]).sum() is the first element of the resulting vector, and so on.
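To make the contraction explicit, here is a short sketch that reproduces torch.einsum('jk,jki->i', ...) with plain loops (same sizes as the example above):
import torch
dLdy = torch.randn(2, 2)
dydx = torch.randn(2, 2, 2)
res = torch.einsum('jk,jki->i', dLdy, dydx)
# written out: res[i] = sum_{j,k} dLdy[j,k] * dydx[j,k,i]
manual = torch.zeros(2)
for i in range(2):
    for j in range(2):
        for k in range(2):
            manual[i] += dLdy[j, k] * dydx[j, k, i]
print(torch.allclose(res, manual))  # True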
I am subtracting two numpy.ndarrays, h and y, with shapes (47,1) and (47,) respectively. When I subtract them in Python, both of the following operations return an array of shape (47,47). I know that mathematically this operation should keep the dimensions of the input arrays, but it's not working that way.
The operations I used are:
e = h - y
e = np.subtract(h,y)
Is this something about how numpy performs the operations, or should I be using a different kind of operation? How do I fix it so that the dimensions of the resulting array are the mathematically correct ones?
The shapes of h and y should be identical for elementwise subtraction, as you mentioned. The two methods you describe are equivalent. The following code works:
import numpy as np
a = np.array([1,2,3,4,5,6,7])
b = np.array([[1,2,3,4,5,6,7]])
print(a.shape) # (7,)
print(b.shape) # (1,7)
c = a-b # or np.subtract(a,b)
print(c.shape) # (1,7)
print(c) # [[0,0,0,0,0,0,0]]
Maybe one of the ndarrays is transposed. The shape of a - b.T is (7,7), as you described.
Edit
I forgot the fact that you described a column vector. In this case the following does the trick for elementwise subtraction (the result has shape (1,47)):
h.T - y
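What happens here is numpy broadcasting: a (47,1) array minus a (47,) array is treated as (47,1) minus (1,47), which broadcasts to (47,47). A small sketch of the failure and a few equivalent fixes (n = 47 here, but any size works):
import numpy as np
n = 47
h = np.random.rand(n, 1)  # column vector, shape (47, 1)
y = np.random.rand(n)     # 1-D array, shape (47,)
print((h - y).shape)                 # (47, 47): broadcasting, not elementwise
print((h.T - y).shape)               # (1, 47): the fix above
print((h - y.reshape(-1, 1)).shape)  # (47, 1): keep the column shape instead
print((h.ravel() - y).shape)         # (47,): or drop to 1-D on both sides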
Suppose we have two numpy arrays: A with shape (n,p,q), B with shape (n,q,r).
How do I multiply them to get an array C with shape (n,p,r)? That is, keep axis 0 and perform matrix multiplication over axes 1 and 2.
I know it can be computed by:
C = np.stack([np.dot(A[i], B[i]) for i in range(A.shape[0])])
But does there exist a numpy function which can be used to compute it directly?
I think you can do np.einsum:
# sample data
n,p,q,r = 2,3,4,5
A = np.zeros((n,p,q))
B = np.zeros((n,q,r))
out = np.einsum('npq,nqr->npr',A,B)
out.shape
# (2, 3, 5)
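For what it's worth, np.matmul (the @ operator) broadcasts over leading batch axes, so it also computes this directly; a quick check against the einsum version:
import numpy as np
n, p, q, r = 2, 3, 4, 5
A = np.random.rand(n, p, q)
B = np.random.rand(n, q, r)
out_einsum = np.einsum('npq,nqr->npr', A, B)
out_matmul = A @ B      # batched matrix multiplication over axis 0
print(out_matmul.shape)                     # (2, 3, 5)
print(np.allclose(out_einsum, out_matmul))  # True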
I'm trying to understand how numpy works when you try to call the dot product of two row vectors.
I have this code:
X = np.array([[1,2,3]])
THETA = np.array([[1,2,3]])
print(X.dot(THETA))
This gives me the error:
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
However, I thought that you could take the dot product of two row vectors to get:
x1*theta1 + x2*theta2 + x3*theta3
And this would also transfer to the dot product of two column vectors.
The weird part is, I have to take the transpose of the second matrix in order to actually use the dot product:
print(X.dot(THETA.T))
[[14]]
However, I didn't expect this to actually work, and I don't understand why it works instead of just doing a row-dot-row operation. Can anyone help me understand what's going on? Is it some rule from linear algebra that I forgot long ago?
dot for 2D input is matrix multiplication, not a dot product. What you're seeing is just the result of the normal rules of matrix multiplication. If you want a vector dot product, the easiest way is to use 1D vectors, with no superfluous second dimension:
X = np.array([1, 2, 3])
THETA = np.array([1, 2, 3])
print(X.dot(THETA))
dot-ting two 1D arrays takes a dot product and produces a scalar result.
If you want to use row and column vectors, then by the standard rules of matrix multiplication, you need to multiply a 1-by-N array (a row vector) by an N-by-1 array (a column vector) to get a 1-by-1 result, and NumPy will give you a 1-by-1 array rather than a scalar.
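A quick sketch of those matrix multiplication rules with explicit 2-D row and column vectors (same values as in the question):
import numpy as np
X = np.array([[1, 2, 3]])      # row vector, shape (1, 3)
THETA = np.array([[1, 2, 3]])  # row vector, shape (1, 3)
print(X.dot(THETA.T))          # (1,3) @ (3,1) -> (1,1): [[14]]
print(THETA.T.dot(X))          # (3,1) @ (1,3) -> (3,3): the outer product
print(X.dot(THETA.T).shape, THETA.T.dot(X).shape)  # (1, 1) (3, 3)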
The alignment error you're seeing is because you're trying to represent a 1D vector as a 2D array.
In [1]: import numpy as np
In [2]: X = np.array([1,2,3])
In [3]: THETA = np.array([1,2,3])
In [4]: print(X.dot(THETA))
14
In [5]: print(X.dot(THETA.T))
14
And:
x1*theta1 + x2*theta2 + x3*theta3 =
1*1 + 2*2 + 3*3 =
14