I am new to Python and am having trouble with matrix multiplication. I have two lists:
A =[3.0,3.0]
# 1 by 2 matrix
B =[[ 50.33112583, -49.66887417],
[-49.66887417, 50.33112583]]
# 2 by 2 matrix
Result should be :
# 1 by 2 matrix
c = [1.9867549668874176, 1.986754966887446]
Right now I am doing:
>> A = numpy.matrix(A)
>> B = numpy.matrix(B)
>> C =A * B
>> C
matrix([[ 1.98675497, 1.98675497]])
>>C.tolist()
[[1.9867549668874176, 1.986754966887446]]
If I do dot product then,
>>> B =numpy.array(B)
>>> B
array([[ 50.33112583, -49.66887417],
[-49.66887417, 50.33112583]])
>>> A = [ 3., 3.]
>>> A =numpy.array(A)
>>> A
array([ 3., 3.])
>>> C = numpy.dot(A,B)
>>> C
array([ 1.98675497, 1.98675497])
>>> C.tolist()
[1.9867549668874176, 1.986754966887446]
Why am I getting two brackets when I use matrix multiplication? Are the dot product and matrix multiplication the same here? Can someone explain this?
When you use np.matrix(), the object is by definition a 2-D container, so operations must be performed between 2-D entities and will return 2-D entities:
np.matrix([[1,2,3], [4,5,6]])*[[1], [2], [3]]
#matrix([[14],
# [32]])
np.matrix([[1,2,3], [4,5,6]])*[1, 2, 3]
#ValueError
When you use np.array(), dot() between two 2-D arrays returns a 2-D array, while dot() between a 2-D array and a 1-D array returns a 1-D array:
np.array([[1,2,3], [4,5,6]]).dot([[1], [2], [3]])
#array([[14],
# [32]])
np.array([[1,2,3], [4,5,6]]).dot([1, 2, 3])
#array([14, 32])
More complex and flexible broadcasting rules for arrays are available when element-wise operations are desired. Here is how each row can be multiplied by a different scalar:
np.array([[1,2,3], [4,5,6]])*[[1], [2]]
#array([[ 1, 2, 3],
# [ 8, 10, 12]])
and how each column can be multiplied by a different scalar:
np.array([[1,2,3], [4,5,6]])*[1, 2, 3]
#array([[ 1, 4, 9],
# [ 4, 10, 18]])
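Applied to the lists from the question, a minimal sketch (assuming a NumPy version where the @ operator is available) shows where the extra bracket comes from and how to drop it. The names C_mat, C_arr and C_flat are illustrative only.

```python
import numpy as np

A = [3.0, 3.0]
B = [[50.33112583, -49.66887417],
     [-49.66887417, 50.33112583]]

# np.matrix is always 2-D, so the product is a 1x2 matrix (two brackets).
C_mat = np.matrix(A) * np.matrix(B)
print(C_mat.shape)   # (1, 2)

# With plain arrays, a 1-D operand stays 1-D; @ and np.dot agree here.
C_arr = np.array(A) @ np.array(B)
print(C_arr.shape)   # (2,)

# Flatten the matrix result if a 1-D array is wanted.
C_flat = np.asarray(C_mat).ravel()
print(np.allclose(C_flat, C_arr))   # True
```

So the numbers are the same in both cases; only the container (2-D matrix versus 1-D array) differs.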
Please assume a vector of invertible matrices:
import numpy as np
a = np.arange(120).reshape((2, 2, 5, 6))
I want to invert the matrices over their defined axes:
b = np.linalg.inv(a, axis1=0, axis2=1)
but this does not seem to be supported.
How can I achieve this?
The inv docs specify its array input as:
a : (..., M, M) array_like
Matrix to be inverted.
You have a
a = np.arange(120).reshape((2, 2, 5, 6))
(M,M,...)
The dimensions are in the wrong order - change them!
In [44]: a = np.arange(120).reshape((2, 2, 5, 6))
Change the axes to the order that inv accepts:
In [45]: A = a.transpose(2,3,0,1)
In [46]: Ai = np.linalg.inv(A)
In [47]: Ai.shape
Out[47]: (5, 6, 2, 2)
In [48]: ai = Ai.transpose(2,3,0,1) # and back
In [49]: ai.shape
Out[49]: (2, 2, 5, 6)
I was going to test the result, but got:
In [50]: x = a@ai
Traceback (most recent call last):
File "<ipython-input-50-9dfe3616745d>", line 1, in <module>
x = a@ai
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 5 is different from 6)
Like inv, matmul treats the last 2 dimensions as the matrix, the first 2 as 'batch':
In [51]: x = A@Ai
In [52]: x[0,0]
Out[52]:
array([[1., 0.],
[0., 1.]])
In [53]: x[0,3]
Out[53]:
array([[1.00000000e+00, 1.38777878e-17],
[4.44089210e-16, 1.00000000e+00]])
We can do the equivalent with einsum:
In [55]: x = np.einsum('ijkl,jmkl->imkl',a,ai)
In [56]: x[:,:,0,0]
Out[56]:
array([[1., 0.],
[0., 1.]])
You might want to change the original specification to match the inv and matmul usage. It could make life easier for you. Also remember that in numpy the trailing dimensions are the innermost ones.
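The transpose round trip can also be written with np.moveaxis, which names only the axes that move. This is a sketch under the same assumption as above (the 2x2 matrices live on the first two axes); the variable names are illustrative.

```python
import numpy as np

a = np.arange(120, dtype=float).reshape((2, 2, 5, 6))

# Move the two matrix axes to the end, where inv expects them.
A = np.moveaxis(a, (0, 1), (-2, -1))      # shape (5, 6, 2, 2)
Ai = np.linalg.inv(A)

# Move them back to the front to match the original layout.
ai = np.moveaxis(Ai, (-2, -1), (0, 1))    # shape (2, 2, 5, 6)

# Check: the matrix product over the leading axes is the identity everywhere.
prod = np.einsum('ijkl,jmkl->imkl', a, ai)
identity = np.eye(2)[:, :, None, None]
print(np.allclose(prod, identity))        # True
```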
If you know that the matrices are 2x2, you can do this easily with the standard formula for inverting such matrices; otherwise, I fear the only reasonable solution is to use for loops. For example, the following works for any shape (adjusting the sizes accordingly):
b = np.stack([np.linalg.inv(a[:, :, i, j]) for i in range(a.shape[2]) for j in range(a.shape[3])], axis=2)
b = b.reshape(2, 2, 5, 6)
as checked by
for i in range(a.shape[2]):
    for j in range(a.shape[3]):
        assert np.allclose(np.dot(a[:,:,i,j], b[:,:,i,j]), np.eye(2))
In the specific 2x2 case you can do the following, which is fully vectorized hence probably faster:
determinants = a[0, 0] * a[1, 1] - a[0, 1] * a[1, 0]
b = 1 / determinants * np.stack([
    np.stack([a[1, 1], -a[0, 1]]),
    np.stack([-a[1, 0], a[0, 0]]),
])
On the specific (small) input size, the second solution is about 10 times faster in my tests (43us vs. 537us).
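As a sanity check, the closed-form 2x2 inverse can be compared against np.linalg.inv. The helper name inv_2x2_leading is hypothetical, and the sketch assumes the matrix axes are the first two and that every determinant is nonzero.

```python
import numpy as np

def inv_2x2_leading(a):
    """Invert 2x2 matrices stored on the first two axes of `a`."""
    det = a[0, 0] * a[1, 1] - a[0, 1] * a[1, 0]
    adjugate = np.stack([
        np.stack([a[1, 1], -a[0, 1]]),
        np.stack([-a[1, 0], a[0, 0]]),
    ])
    return adjugate / det   # det broadcasts over the trailing axes

a = np.arange(120, dtype=float).reshape((2, 2, 5, 6))
b = inv_2x2_leading(a)

# Reference result: move the matrix axes last, call inv, move them back.
ref = np.moveaxis(np.linalg.inv(np.moveaxis(a, (0, 1), (-2, -1))), (-2, -1), (0, 1))
print(np.allclose(b, ref))   # True
```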
For a given 2D tensor I want to retrieve all indices where the value is 1. I expected to be able to simply use torch.nonzero(a == 1).squeeze(), which would return tensor([1, 3, 2]). However, instead, torch.nonzero(a == 1) returns a 2D tensor (that's okay), with two values per row (that's not what I expected). The returned indices should then be used to index the second dimension (index 1) of a 3D tensor, again returning a 2D tensor.
import torch
a = torch.Tensor([[12, 1, 0, 0],
[4, 9, 21, 1],
[10, 2, 1, 0]])
b = torch.rand(3, 4, 8)
print('a_size', a.size())
# a_size torch.Size([3, 4])
print('b_size', b.size())
# b_size torch.Size([3, 4, 8])
idxs = torch.nonzero(a == 1)
print('idxs_size', idxs.size())
# idxs_size torch.Size([3, 2])
print(b.gather(1, idxs))
Evidently, this does not work, leading to a RuntimeError:
RuntimeError: invalid argument 4: Index tensor must have same
dimensions as input tensor at
C:\w\1\s\windows\pytorch\aten\src\TH/generic/THTensorEvenMoreMath.cpp:453
It seems that idxs is not what I expect it to be, nor can I use it the way I thought. idxs is
tensor([[0, 1],
[1, 3],
[2, 2]])
but reading through the documentation I don't understand why I also get back the row indices in the resulting tensor. Now, I know I can get the correct idxs by slicing idxs[:, 1] but then still, I cannot use those values as indices for the 3D tensor because the same error as before is raised. Is it possible to use the 1D tensor of indices to select items across a given dimension?
You could simply slice them and pass it as the indices as in:
In [193]: idxs = torch.nonzero(a == 1)
In [194]: c = b[idxs[:, 0], idxs[:, 1]]
In [195]: c
Out[195]:
tensor([[0.3411, 0.3944, 0.8108, 0.3986, 0.3917, 0.1176, 0.6252, 0.4885],
[0.5698, 0.3140, 0.6525, 0.7724, 0.3751, 0.3376, 0.5425, 0.1062],
[0.7780, 0.4572, 0.5645, 0.5759, 0.5957, 0.2750, 0.6429, 0.1029]])
Alternatively, an even simpler & my preferred approach would be to just use torch.where() and then directly index into the tensor b as in:
In [196]: b[torch.where(a == 1)]
Out[196]:
tensor([[0.3411, 0.3944, 0.8108, 0.3986, 0.3917, 0.1176, 0.6252, 0.4885],
[0.5698, 0.3140, 0.6525, 0.7724, 0.3751, 0.3376, 0.5425, 0.1062],
[0.7780, 0.4572, 0.5645, 0.5759, 0.5957, 0.2750, 0.6429, 0.1029]])
A bit more explanation of the torch.where() approach: it relies on advanced indexing, which is what happens when we index into a tensor using a tuple of sequence objects, such as a tuple of tensors, a tuple of lists, or a tuple of tuples.
# some input tensor
In [207]: a
Out[207]:
tensor([[12., 1., 0., 0.],
[ 4., 9., 21., 1.],
[10., 2., 1., 0.]])
For basic indexing, we would need a tuple of integer indices:
In [212]: a[(1, 2)]
Out[212]: tensor(21.)
To achieve the same using advanced indexing, we would need a tuple of sequence objects:
# adv. indexing using a tuple of lists
In [213]: a[([1,], [2,])]
Out[213]: tensor([21.])
# adv. indexing using a tuple of tuples
In [215]: a[((1,), (2,))]
Out[215]: tensor([21.])
# adv. indexing using a tuple of tensors
In [214]: a[(torch.tensor([1,]), torch.tensor([2,]))]
Out[214]: tensor([21.])
With two 1-D index sequences, as in these examples, the returned tensor has one dimension fewer than the input tensor.
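To tie this back to the question, here is a small sketch (assuming a PyTorch version that supports as_tuple=True and single-argument torch.where) comparing the forms of the index. The variable names are illustrative.

```python
import torch

a = torch.tensor([[12, 1, 0, 0],
                  [4, 9, 21, 1],
                  [10, 2, 1, 0]])
b = torch.rand(3, 4, 8)

# Default form: one [row, col] pair per match, shape (3, 2).
pairs = torch.nonzero(a == 1)

# Tuple form: one 1-D tensor per dimension, ready for advanced indexing.
rows, cols = torch.nonzero(a == 1, as_tuple=True)

c_from_pairs = b[pairs[:, 0], pairs[:, 1]]
c_from_tuple = b[rows, cols]
c_from_where = b[torch.where(a == 1)]

print(pairs.shape)          # torch.Size([3, 2])
print(c_from_tuple.shape)   # torch.Size([3, 8])
```

All three indexing expressions select the same rows of b; the row indices are returned because nonzero reports a full coordinate for every match.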
Assuming that b's three dimensions are batch_size x sequence_length x features (b x s x feats), the expected results can be achieved as follows.
import torch
a = torch.Tensor([[12, 1, 0, 0],
[4, 9, 21, 1],
[10, 2, 1, 0]])
b = torch.rand(3, 4, 8)
print(b.size())
# b x s x feats
idxs = torch.nonzero(a == 1)[:, 1]
print(idxs.size())
# b
c = b[torch.arange(b.size(0)), idxs]
print(c.size())
# b x feats
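If gather is preferred, as in the original attempt, the index has to be expanded to the same number of dimensions as b. The following is only a sketch, assuming exactly one match per row of a; the names index, c_gather and c_index are illustrative.

```python
import torch

a = torch.tensor([[12, 1, 0, 0],
                  [4, 9, 21, 1],
                  [10, 2, 1, 0]])
b = torch.rand(3, 4, 8)

# Column index of the single 1 in each row, shape (3,).
idxs = torch.nonzero(a == 1)[:, 1]

# gather needs an index with as many dimensions as b: (3,) -> (3, 1, 8).
index = idxs[:, None, None].expand(-1, 1, b.size(2))
c_gather = b.gather(1, index).squeeze(1)       # shape (3, 8)

# Same selection with advanced indexing.
c_index = b[torch.arange(b.size(0)), idxs]
print(torch.equal(c_gather, c_index))          # True
```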
import torch
a = torch.Tensor([[12, 1, 0, 0],
[4, 9, 21, 1],
[10, 2, 1, 0]])
b = torch.rand(3, 4, 8)
print('a_size', a.size())
# a_size torch.Size([3, 4])
print('b_size', b.size())
# b_size torch.Size([3, 4, 8])
#idxs = torch.nonzero(a == 1, as_tuple=True)
idxs = torch.nonzero(a == 1)
#print('idxs_size', idxs.size())
print(torch.index_select(b,1,idxs[:,1]))
As a supplement to @kmario23's solution, you can achieve the same result with:
b[torch.nonzero(a==1,as_tuple=True)]
I have an array that I reshaped to the following dimensions: (-1,1,1,1) and (-1,1):
Array A:
[-0.888788523827 0.11842529285 0.319928774626 0.319928774626 0.378755429421 1.225877519716 3.830653798838]
A.reshape(-1,1,1,1):
[[[[-0.888788523827]]]
[[[ 0.11842529285 ]]]
[[[ 0.319928774626]]]
[[[ 0.319928774626]]]
[[[ 0.378755429421]]]
[[[ 1.225877519716]]]
[[[ 3.830653798838]]]]
A.reshape(-1,1):
[[-0.888788523827]
[ 0.11842529285 ]
[ 0.319928774626]
[ 0.319928774626]
[ 0.378755429421]
[ 1.225877519716]
[ 3.830653798838]]
Then I subtracted one from the other and broadcasting came in, so my resulting matrix is 7x1x7x1.
I have a hard time visualizing the intermediate step of what broadcasting does. I cannot picture which elements of the arrays are repeated and what the arrays look like during broadcasting.
Could somebody shed some light on this problem, please?
In [5]: arr = np.arange(4)
In [6]: A = arr.reshape(-1,1,1,1)
In [7]: B = arr.reshape(-1,1)
In [8]: C = A + B
In [9]: C.shape
Out[9]: (4, 1, 4, 1)
In [10]: A.shape
Out[10]: (4, 1, 1, 1)
In [11]: B.shape
Out[11]: (4, 1)
There are 2 basic broadcasting rules:
expand the dimensions to match - by adding size 1 dimensions at the start
adjust all size 1 dimensions to match
So in this example:
(4,1,1,1) + (4,1)
(4,1,1,1) + (1,1,4,1) # add 2 size 1's to B
(4,1,4,1) + (4,1,4,1) # adjust 2 of the 1's to 4
(4,1,4,1)
The first step is, perhaps, the most confusing. The (4,1) is expanded to (1,1,4,1), not (4,1,1,1). The rule is intended to avoid ambiguity - by expanding in a consistent manner, not necessarily what a human might intuitively want.
Imagine the case where both arrays need expansion to match, and it could add a dimension in either direction:
(4,) and (3,)
(1,4) and (3,1) or (4,1) and (1,3)
(3,4) or (4,3)
confusion
The rule requires that the programmer choose which one expands to the right (4,1) or (3,1). numpy can then unambiguously add the other.
For a simpler example:
In [22]: A=np.arange(3).reshape(-1,1)
In [23]: B=np.arange(3)
In [24]: C = A+B    # (3,1)+(3,) => (3,1)+(1,3) => (3,3)
In [25]: C
Out[25]:
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4]])
In [26]: C.shape
Out[26]: (3, 3)
The [0,2,4] are present, but on the diagonal of C.
When broadcasting like this, the result is a kind of outer sum:
In [27]: np.add.outer(B,B)
Out[27]:
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4]])
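To actually look at the intermediate arrays, np.broadcast_arrays can materialize what each operand looks like after expansion. This is a sketch using the 4-element example from above with subtraction, as in the question; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(4)
A = arr.reshape(-1, 1, 1, 1)   # shape (4, 1, 1, 1)
B = arr.reshape(-1, 1)         # shape (4, 1), treated as (1, 1, 4, 1)

# Expanded views of both operands, each with the full result shape.
A_full, B_full = np.broadcast_arrays(A, B)
print(A_full.shape, B_full.shape)   # (4, 1, 4, 1) (4, 1, 4, 1)

# Drop the two size-1 axes to get readable 4x4 tables.
print(A_full[:, 0, :, 0])   # row i is filled with arr[i]
print(B_full[:, 0, :, 0])   # column j is filled with arr[j]

C = A - B
diff = C[:, 0, :, 0]        # diff[i, j] == arr[i] - arr[j]
print(np.array_equal(diff, np.subtract.outer(arr, arr)))   # True
```

So A is repeated along the third axis and B along the first, and the result holds every pairwise difference.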
Let's assume I have 2 matrices, each of which represents a vector:
X = np.matrix([[1],[2],[3]])
Y = np.matrix([[4],[5],[6]])
I want the output to be the result of multiplying them element by element, which means it should be:
[[4],[10],[18]]
Note that it is np.matrix and not np.array
I tested np.multiply() in IPython and it worked like a charm:
In [41]: X = np.matrix([[1],[2],[3]])
In [42]: Y = np.matrix([[4],[5],[6]])
In [43]: np.multiply(X, Y)
Out[43]:
matrix([[ 4],
[10],
[18]])
Remember that a NumPy matrix is a subclass of the NumPy array, and array operations are element-wise.
Therefore, you can convert your matrices to NumPy arrays, then multiply them with the "*" operator, which will be element-wise:
>>> import numpy as NP
>>> X = NP.matrix([[1],[2],[3]])
>>> Y = NP.matrix([[4],[5],[6]])
>>> X1 = NP.array(X)
>>> Y1 = NP.array(Y)
>>> XY1 = X1 * Y1
>>> XY1
array([[ 4],
[10],
[18]])
>>> XY = NP.matrix(XY1)
>>> XY
matrix([[ 4],
[10],
[18]])
Alternatively, you can use a generic function for element-wise multiplication:
>>> a = NP.matrix("4 5 7; 9 3 2; 3 9 1")
>>> b = NP.matrix("5 2 9; 8 4 2; 1 7 4")
>>> ab = NP.multiply(a, b)
>>> ab
matrix([[20, 10, 63],
[72, 12, 4],
[ 3, 63, 4]])
These two approaches differ in the return type, so you probably want the first if the next function in your data flow requires a NumPy array, and the second if it requires a NumPy matrix.
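The difference between the operators can be seen side by side in a short sketch. It assumes np.matrix is still available in the installed NumPy (it is discouraged for new code); the variable names are illustrative.

```python
import numpy as np

X = np.matrix([[1], [2], [3]])
Y = np.matrix([[4], [5], [6]])

# On np.matrix, * is the matrix product, so the shapes must be compatible.
inner = X.T * Y                      # (1,3) times (3,1) gives a 1x1 matrix
print(inner)                         # [[32]]

# np.multiply is element-wise and keeps the matrix type.
elem_matrix = np.multiply(X, Y)
print(type(elem_matrix).__name__, elem_matrix.shape)    # matrix (3, 1)

# np.asarray gives an ndarray, where * is element-wise.
elem_array = np.asarray(X) * np.asarray(Y)
print(type(elem_array).__name__, elem_array.tolist())   # ndarray [[4], [10], [18]]
```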
Consider the following vectors (essentially 3x1 matrices):
a = sc.array([[1], [2], [3]])
>>> a
[[1]
[2]
[3]]
b = sc.array([[4], [5], [6]])
>>> b
[[4]
[5]
[6]]
The cross product of these vectors can be calculated using numpy.cross(). Why does this not work:
import numpy as np
np.cross(a, b)
ValueError: incompatible dimensions for cross product
(dimension must be 2 or 3)
but this does?:
np.cross(a.T, b.T)
[[-3 6 -3]]
To compute the cross product using numpy.cross, the length of the array axis that defines the two vectors must be either two or three. To quote the documentation:
If a and b are arrays of vectors, the vectors
are defined by the last axis of a and b by default, and these axes
can have dimensions 2 or 3.
Note that the last axis is the default. In your example:
In [17]: a = np.array([[1], [2], [3]])
In [18]: b = np.array([[4], [5], [6]])
In [19]: print(a.shape, b.shape)
(3, 1) (3, 1)
the last axis is only of length 1, so the cross product is not defined. However, if you use the transpose, the length along the last axis is 3, so it is valid. You could also do:
In [20]: np.cross(a,b,axis=0)
Out[20]:
array([[-3],
[ 6],
[-3]])
which tells cross that the vectors are defined along the first axis, rather than the last axis.
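Both workarounds can be put next to each other in a short sketch, assuming the column vectors from the question; the names flat and col are illustrative.

```python
import numpy as np

a = np.array([[1], [2], [3]])   # shape (3, 1)
b = np.array([[4], [5], [6]])

# Option 1: flatten to 1-D, so the only axis (length 3) holds the vector.
flat = np.cross(a.ravel(), b.ravel())
print(flat)            # [-3  6 -3]

# Option 2: keep the column shape and name the axis that holds the vectors.
col = np.cross(a, b, axis=0)
print(col.shape)       # (3, 1)
```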
In numpy we often use 1d arrays to represent vectors, and we treat it as either a row vector or a column vector depending on the context, for example:
In [13]: a = np.array([1, 2, 3])
In [15]: b = np.array([4, 5, 6])
In [16]: np.cross(a, b)
Out[16]: array([-3, 6, -3])
In [17]: np.dot(a, b)
Out[17]: 32
You can store vectors as 2d arrays, which is most useful when you have a collection of vectors you want to treat in a similar way, for example if you want to cross 4 vectors in a with 4 vectors in b. By default numpy assumes the vectors are along the last dimension, but you can use the axisa and axisb arguments to explicitly specify that the vectors are along the first dimension.
In [26]: a = np.random.random((3, 4))
In [27]: b = np.random.random((3, 4))
In [28]: np.cross(a, b, axisa=0, axisb=0)
Out[28]:
array([[-0.34780508, 0.54583745, -0.25644455],
[ 0.03892861, 0.18446659, -0.36877085],
[ 0.36736545, 0.13549752, -0.32647531],
[-0.46253185, 0.56148668, -0.10056834]])
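To confirm what the batched call computes, it can be compared with a plain loop over the columns. This is a sketch with a seeded random generator so that it is repeatable; the names batched and looped are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 4))   # 4 vectors of length 3, stored as columns
b = rng.random((3, 4))

# Vectors are along axis 0 of both inputs; the result puts them on the last axis.
batched = np.cross(a, b, axisa=0, axisb=0)

# Reference: cross each pair of columns one at a time.
looped = np.array([np.cross(a[:, k], b[:, k]) for k in range(a.shape[1])])

print(batched.shape)                  # (4, 3)
print(np.allclose(batched, looped))   # True
```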
You should create a and b like this:
a = sc.array([1, 2, 3])
b = sc.array([4, 5, 6])
so that they have dimension = 3.