Tensordot Explanation from Numpy documentation - python

I don't understand how tensordot works. I was reading the official documentation, but I don't understand at all what is happening there.
a = np.arange(60.).reshape(3,4,5)
b = np.arange(24.).reshape(4,3,2)
c = np.tensordot(a,b, axes=([1,0],[0,1]))
c.shape
(5, 2)
Why is the shape (5, 2)? What exactly is happening?
I also read this article but the answer is confusing me.
In [7]: A = np.random.randint(2, size=(2, 6, 5))
...: B = np.random.randint(2, size=(3, 2, 4))
...:
In [9]: np.tensordot(A, B, axes=((0),(1))).shape
Out[9]: (6, 5, 3, 4)
A : (2, 6, 5) -> reduction of axis=0
B : (3, 2, 4) -> reduction of axis=1
Output : `(2, 6, 5)`, `(3, 2, 4)` ===(2 gone)==> `(6,5)` + `(3,4)` => `(6,5,3,4)`
Why is the shape (6, 5, 3, 4)?
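As a sanity check (the einsum spelling below is an equivalent I'm adding, not from the article): label A as (a,b,c) and B as (d,a,e), sum over the shared a, and the leftover axes appear in order, giving (6,5,3,4):

import numpy as np

A = np.random.randint(2, size=(2, 6, 5))
B = np.random.randint(2, size=(3, 2, 4))

ref = np.einsum('abc,dae->bcde', A, B)                    # shape (6, 5, 3, 4)
np.allclose(np.tensordot(A, B, axes=((0,), (1,))), ref)   # True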

In [196]: a = np.arange(60.).reshape(3,4,5)
...: b = np.arange(24.).reshape(4,3,2)
...: c = np.tensordot(a,b, axes=([1,0],[0,1]))
In [197]: c
Out[197]:
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
I find the einsum equivalent to be easier to "read":
In [198]: np.einsum('ijk,jil->kl',a,b)
Out[198]:
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
tensordot works by transposing and reshaping the inputs to reduce the problem to a simple dot:
In [204]: a1 = a.transpose(2,1,0).reshape(5,12)
In [205]: b1 = b.reshape(12,2)
In [206]: np.dot(a1,b1)    # or a1@b1
Out[206]:
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
tensordot can do further manipulation to the result, but that's not needed here.
I had to try several things before I got a1/b1 right. For example a.transpose(2,0,1).reshape(5,12) produces the right shape, but different values.
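To see that concretely (a quick check I'm adding, reusing a, a1, b1 and c from the session above):

a2 = a.transpose(2,0,1).reshape(5,12)   # flattens a's (3,4) axes in (i,j) order
np.allclose(a2 @ b1, c)   # False - b1 was flattened from b's (4,3), i.e. (j,i) order
np.allclose(a1 @ b1, c)   # True  - a1's (4,3) flattening matches b1's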
yet another version:
In [210]: (a.transpose(1,0,2)[:,:,:,None]*b[:,:,None,:]).sum((0,1))
Out[210]:
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
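(To unpack that one-liner: a.transpose(1,0,2) has shape (4,3,5), lining up with b's leading (4,3); the None axes expand the elementwise product to shape (4,3,5,2); and summing over axes (0,1) performs the same contraction of the shared (4,3) pair, leaving (5,2).)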

Related

np.tensordot function, how to multiply a tensor by successive slices of another?

I want to multiply two 3D tensors in a specific way.
The two tensors have shapes T1 = (a,b,c) and T2 = (d,b,c).
What I want is to multiply T2 by each of the a successive (b,c) 'slices' of T1.
In other words, I want to have the same as this code :
import numpy as np
a=2
b=3
c=4
d=5
T1 = np.random.rand(a,b,c)
T2 = np.random.rand(d,b,c)
L= []
for j in range(a):
    L += [T1[j,:,:]*T2]
L = np.array(L)
L.shape
I have the iterative solution, and I tried with the axes argument, but I didn't succeed with the second, vectorized way.
Ok, now I think I got the solution:
a=2
b=3
c=4
d=5
T1 = np.random.rand(a,b,c)
T2 = np.random.rand(d,b,c)
L = np.zeros(shape=(a,d,b,c))
for i1 in range(len(T1)):
    for i2 in range(len(T2)):
        L[i1,i2] = np.multiply(np.array(T1[i1]), np.array(T2[i2]))
Since the shapes:
In [26]: T1.shape, T2.shape
Out[26]: ((2, 3, 4), (5, 3, 4))
produce a:
In [27]: L.shape
Out[27]: (2, 5, 3, 4)
Let's try a broadcasted pair of arrays:
In [28]: res = T1[:,None]*T2[None,:]
Shape and values match:
In [29]: res.shape
Out[29]: (2, 5, 3, 4)
In [30]: np.allclose(L,res)
Out[30]: True
tensordot, dot, and matmul don't apply here; this is just plain elementwise multiplication, with broadcasting.
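For what it's worth, the same broadcasted product has an einsum spelling with all four indices kept in the output, so nothing is summed (a line I'm adding, not from the original answer):

res2 = np.einsum('abc,dbc->adbc', T1, T2)   # shared b,c kept in output: elementwise, no contraction
np.allclose(res, res2)   # True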

Getting unexpected shape while slicing a numpy array

I have a 4D numpy array. While slicing with multiple indices in a single dimension, my axes get interchanged. Am I missing something trivial here?
import numpy as np
from smartprint import smartprint as prints
a = np.random.rand(50, 60, 70, 80)
b = a[:, :, :, [2,3,4]]
prints (b.shape) # this works as expected
c = a[1, :, :, [2,3,4]]
prints (c.shape) # here, I see the axes are interchanged
Output:
b.shape : (50, 60, 70, 3)
c.shape : (3, 60, 70)
Here are some observations that may help explain the problem.
Start with a 3d array, with the expected strides:
In [158]: x=np.arange(24).reshape(2,3,4)
In [159]: x.shape,x.strides
Out[159]: ((2, 3, 4), (48, 16, 4))
Advanced indexing on the last axis:
In [160]: y=x[:,:,[0,1,2,3]]
In [161]: y.shape, y.strides
Out[161]: ((2, 3, 4), (12, 4, 24))
Notice that the strides are not in the normal C-contiguous order. For a 2d array we'd describe this as F-contiguous. It's an obscure indexing detail that usually doesn't matter.
Apparently when doing this indexing it first makes an array with the indexed dimension (here the last) first:
In [162]: y.base.shape
Out[162]: (4, 2, 3)
In [163]: y.base.strides
Out[163]: (24, 12, 4)
y is this base with swapped axes, a view of its base.
The case with a slice in the middle is:
In [164]: z=x[1,:,[0,1,2,3]]
In [165]: z.shape, z.strides
Out[165]: ((4, 3), (12, 4))
In [166]: z.base     # None - z owns its data; it's not a view
Transposing z to the expected (3,4) shape would switch the strides to (4,12), F-contiguous.
With the two-step indexing, we get an array with the expected shape, but the F strides. And its base looks a lot like z.
In [167]: w=x[1][:,[0,1,2,3]]
In [168]: w.shape, w.strides
Out[168]: ((3, 4), (4, 12))
In [169]: w.base.shape, w.base.strides
Out[169]: ((4, 3), (12, 4))
The docs justify the switch in axes by saying that there's an ambiguity when performing advanced indexing with a slice in the middle. It's perhaps clearest when using (2,1) and (4,) shaped indices:
In [171]: w=x[[[0],[1]],:,[0,1,2,3]]
In [172]: w.shape, w.strides
Out[172]: ((2, 4, 3), (48, 12, 4))
The middle, size-3 dimension is "tacked on" last. With x[1,:,[0,1,2,3]] that ambiguity argument isn't as strong, but apparently the same indexing method is used. When this was raised in github issues, the claim was that reworking the indexing to correct this was too difficult; individual cases might be corrected, but a comprehensive change was too complicated.
This dimension switch seems to come up on SO a couple of times a year; an annoyance, but not a critical issue.
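If the reordered axes are a problem in practice, a common workaround (sketched here with the question's array) is to split the indexing into two steps, or to move the misplaced axis back afterwards:

import numpy as np

a = np.random.rand(50, 60, 70, 80)
c2 = a[1][:, :, [2, 3, 4]]      # two-step indexing keeps the order: (60, 70, 3)
c = a[1, :, :, [2, 3, 4]]       # one step puts the indexed axis first: (3, 60, 70)
np.allclose(c2, np.moveaxis(c, 0, -1))   # True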

How many addition operations are being performed by np.sum()?

Let's say I have an array of shape (1, 3, 4, 4) and I apply numpy.sum() on it, reducing over axes (2, 3). Below is sample code --
import numpy as np
data = np.random.rand(1, 3, 4, 4)
res = np.sum(data, axis=(2,3), keepdims=True)
How many addition operations are being done by np.sum()?
In [202]: data = np.arange(3*4*4).reshape(1,3,4,4)
do your sum:
In [203]: res = np.sum(data, axis=(2,3), keepdims=True)
In [204]: res
Out[204]:
array([[[[120]],

        [[376]],

        [[632]]]])
In [205]: res.shape
Out[205]: (1, 3, 1, 1)
to produce each of the 3 sums:
In [207]: for i in range(3):
...:     print(data[0,i].sum())
...:
120
376
632
And in a more detailed simulation (for one of those 3):
In [208]: tot=0
...: for i in range(4):
...:     for j in range(4):
...:         tot += data[0,0,i,j]
...:
In [209]: tot
Out[209]: 120
I'll let you count the +=.
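If you do count them: the simulation executes 16 += per block, because it also adds into the initial 0. At the arithmetic level, summing n numbers takes n-1 additions, so:

adds_per_block = 4*4 - 1    # 15 additions to combine 16 numbers
n_blocks = 3                # one (4,4) block per entry along axis 1
adds_per_block * n_blocks   # 45

(How np.sum groups those additions internally, e.g. pairwise summation, may differ, but the count per block is still n-1.)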

Multidimensional array in numpy

I have an array of shape (5,2), in which each row consists of an array of shape (4,3,2) and a float number.
After I slice that array with [:,0], I get an array of shape (5,), in which each element has shape (4,3,2), instead of an array of shape (5,4,3,2) (even if I use np.array() on it).
Why?
Edited
Example:
a1 = np.arange(50).reshape(5, 5, 2)
a2 = np.arange(50).reshape(5, 5, 2)
b1 = 15.0
b2 = 25.0
h = []
h.append(np.array([a1, b1]))
h.append(np.array([a2, b2]))
h = np.array(h)[:,0]
np.shape(h) # (2,)
np.shape(h[0]) # (5, 5, 2)
np.shape(h[1]) # (5, 5, 2)
h = np.array(h)
np.shape(h) # (2,) Why not (2, 5, 5, 2)?
You have an array of objects; you can use np.stack to convert it to the shape you need, if you are sure all the sub-elements have the same shape:
np.stack(a[:,0])
a = np.array([[np.arange(24).reshape(4,3,2), 1.]]*5)
a.shape
# (5, 2)
a[:,0].shape
# (5,)
a[:,0][0].shape
# (4, 3, 2)
np.stack(a[:,0]).shape
# (5, 4, 3, 2)
In [121]: a1.dtype, a1.shape
Out[121]: (dtype('int32'), (5, 5, 2))
In [122]: c1 = np.array([a1,b1])
In [123]: c1.dtype, c1.shape
Out[123]: (dtype('O'), (2,))
Because a1 and b1 are different shaped objects (b1 isn't even an array), an array made from them will have dtype object. And the h made from several such pairs continues to be object dtype.
In [124]: h = np.array(h)
In [125]: h.dtype, h.shape
Out[125]: (dtype('O'), (2, 2))
In [126]: h[:,1]
Out[126]: array([15.0, 25.0], dtype=object)
In [127]: h[:,0].dtype
Out[127]: dtype('O')
After the appends, h (as an array) is object dtype. The 2nd column holds the b1 and b2 values, the 1st column the a1 and a2 arrays.
Some form of concatenate is required to combine those a1 a2 arrays into one. stack does it on a new axis.
In [128]: h[0,0].shape
Out[128]: (5, 5, 2)
In [129]: np.array(h[:,0]).shape # np.array doesn't cross the object boundary
Out[129]: (2,)
In [130]: np.stack(h[:,0]).shape
Out[130]: (2, 5, 5, 2)
In [131]: np.concatenate(h[:,0],0).shape
Out[131]: (10, 5, 2)
Turning the (2,) array into a list does allow np.array to recombine the elements into a higher dimensional array, just as np.stack does:
In [133]: np.array(list(h[:,0])).shape
Out[133]: (2, 5, 5, 2)
You appear to believe that NumPy can magically divine your intent. As @Barmar explains in the comments, when you slice a shape (5,2) array with [:, 0] you get all rows of the first column of that array. Each element of that slice is a shape (4,3,2) array. NumPy is giving you exactly what you asked for.
If you want to convert that into a shape (5,4,3,2) array, you'll need further processing to extract the elements from the shape (4,3,2) arrays.

Matrix multiplication of inner dimensions of 3D tensors?

I have two numpy arrays of shape (386, 3, 4) and (386, 4, 3). I want to produce an output of shape (386, 3, 3). In other words, I wish to execute the following loop in a vectorized fashion -
for i in range(len(input1)):
    output[i] = np.matmul(input1[i], input2[i])
What's the best way to do this?
matmul also works:
a = np.random.random((243,3,4))
b = np.random.random((243,4,3))
np.matmul(a,b).shape
# (243, 3, 3)
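The @ operator is the same thing; like matmul, it treats the leading axis as a batch dimension and multiplies the trailing (3,4) and (4,3) matrices pairwise:

np.allclose(a @ b, np.matmul(a, b))   # True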
We need to keep the first axes aligned, so I would suggest using an approach with np.einsum -
np.einsum('ijk,ikl->ijl',input1,input2)
Sample run to verify shapes -
In [106]: a = np.random.rand(386, 3, 4)
In [107]: b = np.random.rand(386, 4, 3)
In [108]: np.einsum('ijk,ikl->ijl',a,b).shape
Out[108]: (386, 3, 3)
Sample run to verify values on smaller input -
In [174]: a = np.random.rand(2, 3, 4)
In [175]: b = np.random.rand(2, 4, 3)
In [176]: output = np.zeros((2,3,3))
In [177]: for i in range(len(a)):
...:     output[i] = np.matmul(a[i], b[i])
...:
In [178]: output
Out[178]:
array([[[ 1.43473795,  0.860279  ,  1.17855877],
        [ 1.91036828,  1.23063125,  1.5319063 ],
        [ 1.06489098,  0.86868941,  0.84986621]],

       [[ 1.07178572,  1.020091  ,  0.63070531],
        [ 1.34033495,  1.26641131,  0.79911685],
        [ 1.68916831,  1.63009854,  1.14612462]]])
In [179]: np.einsum('ijk,ikl->ijl',a,b)
Out[179]:
array([[[ 1.43473795,  0.860279  ,  1.17855877],
        [ 1.91036828,  1.23063125,  1.5319063 ],
        [ 1.06489098,  0.86868941,  0.84986621]],

       [[ 1.07178572,  1.020091  ,  0.63070531],
        [ 1.34033495,  1.26641131,  0.79911685],
        [ 1.68916831,  1.63009854,  1.14612462]]])
Sample run to verify values on bigger input -
In [180]: a = np.random.rand(386, 3, 4)
In [181]: b = np.random.rand(386, 4, 3)
In [182]: output = np.zeros((386,3,3))
In [183]: for i in range(len(a)):
...:     output[i] = np.matmul(a[i], b[i])
...:
In [184]: np.allclose(np.einsum('ijk,ikl->ijl',a,b), output)
Out[184]: True
