understand numpy.dot with N-d array and 1-d array [closed] - python

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
To who voted to close because of unclear what I'm asking, here are the questions in my post:
Can anyone tell me what's the result of y?
Is there anything called sum product in Mathematics?
Is x subject to broadcasting?
Why is y a column/row vector?
What if x=np.array([[7],[2],[3]])?
w=np.array([[1,2,3],[4,5,6],[7,8,9]])
x=np.array([7,2,3])
y=np.dot(w,x)
Can anyone tell me what's the result of y?
I deliberately Mosaic the screenshot so that you pretend you are in a test and cannot run python to get the result.
https://docs.scipy.org/doc/numpy-1.15.4/reference/generated/numpy.dot.html#numpy.dot says
If a is an N-D array and b is a 1-D array, it is a sum product over
the last axis of a and b.
Is there anything called sum product in Mathematics?
Is x subject to broadcasting?
Why is y a column/row vector?
What if x=np.array([[7],[2],[3]])?

np.dot is nothing but matrix multiplication if the dimensions match for multiplication (i.e. w is 3x3 and x is 1x3, so matrix multiplication of WX cannot be made but XW is okay). In first case:
>>> w=np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> x=np.array([7,2,3])
>>> w.shape
(3, 3)
>>> x.shape # 1d vector
(3, )
So in this case it returns the inner product of each row of W with X:
>>> [np.dot(ww,x) for ww in w]
[20, 56, 92]
>>> np.dot(w,x)
array([20, 56, 92]) # as they are both same
change the order
>>> y = np.dot(x,w) # matrix mult as usual
>>> y
array([36, 48, 60])
In second case:
>>> x=np.array([[7],[2],[3]])
>>> x.shape
(3, 1)
>>> y = np.dot(w,x) # matrix mult
>>> y
array([[20],
[56],
[92]])
However, this time dimensions does not match for both multiplication (3x1,3x3) and inner product (1x1,1x3) so it raises error.
>>> y = np.dot(x,w)
Traceback (most recent call last):
File "<ipython-input-110-dcddcf3bedd8>", line 1, in <module>
y = np.dot(x,w)
ValueError: shapes (3,1) and (3,3) not aligned: 1 (dim 1) != 3 (dim 0)

I don't think your question is unclear, but rather overly pedantic.
For example, why are you puzzled by sum product in this nD by 1d case, when the docs use inner product for the 1d by 1d case, and matrix product in the 2d by 2d case? Give yourself some freedom to read it as sum of the products, as done in the inner product.
To make your example clearer, make w rectangular, to better distinguish row actions from column ones:
In [168]: w=np.array([[1,2,3],[4,5,6]])
...: x=np.array([7,2,3])
...:
...:
In [169]: w.shape
Out[169]: (2, 3)
In [170]: x.shape
Out[170]: (3,)
The dot and its equivalent einstein notation:
In [171]: np.dot(w,x)
Out[171]: array([20, 56])
In [172]: np.einsum('ij,j->i',w,x)
Out[172]: array([20, 56])
The sum of the products is being done on the repeated j dimension, without summation on i.
We can do the same thing with broadcasted elementwise multiplication:
In [173]: (w*x[None,:]).sum(axis=1)
Out[173]: array([20, 56])
While this equivalent operation does use broadcasting, it's better not to think of dot in those terms.
matmul gives another description of the same action, adding a dimension to x to form a 2d by 2d matrix product, followed by a squeeze to remove the extra dimension. I don't think dot does that under the covers, but the result is the same.
This may also be called matrix vector multiplication, provided you don't insist on calling the 1d x a row vector or column vector.
Now for a 2d x, with shape (3,1):
In [175]: x2 = x[:,None]
In [176]: x2
Out[176]:
array([[7],
[2],
[3]])
In [177]: x2.shape
Out[177]: (3, 1)
In [178]: np.dot(w,x2)
Out[178]:
array([[20],
[56]])
In [179]: np.einsum('ij,jk->ik',w,x2)
Out[179]:
array([[20],
[56]])
The sum is over j, the last axis of w, and 2nd to the last of x. To do the same with elementwise we have to use broadcasting to generate a 3d outer product, and then do the sum to reduce the dimension back to 2.
In [180]: (w[:,:,None]*x2[None,:,:]).sum(axis=1)
Out[180]:
array([[20],
[56]])
In this example a (2,3) dot (3,1) => (2,1). That's perfectly normal matrix product behavior. In the first (2,3) dot (3,) => (2,). To me this is a logical generalization. (3,) dot (3,) => scalar (as opposed to ()` is a bit more of a special case.
I suspect the first case is mainly a problem for people who see a (3,) shape and think (1,3), a row-vector. (2,3) dot (1,3) doesn't work, because of the mismatch between the 3 and the 1.

Related

How can I turn a 1x3 matrix into a one dimensional array

I'm making a function to calculate a dot product when given two vectors. The code is later used in a matrix multiplication function. The issue I'm having is that the parameters passed in from the matrix multiplication function are 1x3 matrices that in order to multiply together, I need to use dot+=A[0,,i]*B[0,i]. The submission website expects dot+=A[i],B[i] and I'm not sure how to get my matrices converted to arrays. Picture of my two functions
I tried recreating the matrices and isolating rows but I still had [[1,2,3]] come up and mes with my dot product function.
I suggest you to use the numpy library to perform matrix multiplication.
You can use the numpy.matmul function to perform matrix multiplication. Read more about it here: https://numpy.org/doc/stable/reference/generated/numpy.matmul.html
You can also use the numpy.dot function to perform dot product. Read more about it here: https://numpy.org/doc/stable/reference/generated/numpy.dot.html
Here is an example of how to use the numpy library to perform matrix multiplication:
import numpy as np
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
c = np.matmul(a, b)
Glancing at your image, I noticed the use of np.matrix. Be careful with that. That's an array subclass designed years ago to make numpy more friendly to wayward MATLAB users. It's use is discouraged now.
A key difference is that it must always be 2d:
In [108]: M = np.matrix([1,2,3])
In [109]: M
Out[109]: matrix([[1, 2, 3]]) # note the added dimension
In [110]: M.shape
Out[110]: (1, 3)
And ravel doesn't remove that leading dimension:
In [111]: M.ravel()
Out[111]: matrix([[1, 2, 3]])
np.matrix does have a A1 property, which ravels and converts to ndarray, or rather converts to array and ravels:
In [112]: M.A1
Out[112]: array([1, 2, 3])
np.matrix does matrix multiplication with using *, though now # is recommended operator that works with ndarray as well
In [113]: M*M
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[113], line 1
----> 1 M*M
File ~\miniconda3\lib\site-packages\numpy\matrixlib\defmatrix.py:218, in matrix.__mul__(self, other)
215 def __mul__(self, other):
216 if isinstance(other, (N.ndarray, list, tuple)) :
217 # This promotes 1-D vectors to row vectors
--> 218 return N.dot(self, asmatrix(other))
219 if isscalar(other) or not hasattr(other, '__rmul__') :
220 return N.dot(self, other)
File <__array_function__ internals>:180, in dot(*args, **kwargs)
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
With a (1,3) shape, we have to use a transpose to do dot, producing (3,3) one way:
In [114]: M.T*M # (3,1) with (1,3) => (3,3)
Out[114]:
matrix([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
or (1,1) the other:
In [115]: M*M.T # (1,3) with (3,1) => (1,1)
Out[115]: matrix([[14]])
where as using the 1d A1 gives a scalar, dot product:
In [116]: M.A1 # M.A1
Out[116]: 14
While you are welcome to implement your own dot (for learning purpose, I assume), be aware that numpy has good code to do this for you.
np.dot
np.matmul
np.einsum
Note that dot and matmul extend the usual 2d matrix multiplication, with special rules for 1d arrays, and for 3+d. But the key pairing is "last dim A pairs with the 2nd to the last of B", the usual "across columns, down rows" we learned as kids.

Python - matrix multiplication

i have an array y with shape (n,), I want to compute the inner product matrix, which is a n * n matrix
However, when I tried to do it in Python
np.dot(y , y)
I got the answer n, this is not what I am looking for
I have also tried:
np.dot(np.transpose(y),y)
np.dot(y, np.transpose(y))
I always get the same answer n
I think you are looking for:
np.multiply.outer(y,y)
or equally:
y = y[None,:]
y.T#y
example:
y = np.array([1,2,3])[None,:]
output:
#[[1 2 3]
# [2 4 6]
# [3 6 9]]
You can try to reshape y from shape (70,) to (70,1) before multiplying the 2 matrices.
# Reshape
y = y.reshape(70,1)
# Either below code would work
y*y.T
np.matmul(y,y.T)
One-liner?
np.dot(a[:, None], a[None, :])
transpose doesn't work on 1-D arrays, because you need atleast two axes to 'swap' them. This solution adds a new axis to the array; in the first argument, it looks like a column vector and has two axes; in the second argument it still looks like a row vector but has two axes.
Looks like what you need is the # matrix multiplication operator. dot method is only to compute dot product between vectors, what you want is matrix multiplication.
>>> a = np.random.rand(70, 1)
>>> (a # a.T).shape
(70, 70)
UPDATE:
Above answer is incorrect. dot does the same things if the array is 2D. See the docs here.
np.dot computes the dot product of two arrays. Specifically,
If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a # b is preferred.
Simplest way to do what you want is to convert the vector to a matrix first using np.matrix and then using the #. Although, dot can also be used # is better because conventionally dot is used for vectors and # for matrices.
>>> a = np.random.rand(70)
(70,)
>>> a.shape
>>> a = np.matrix(a).T
>>> a.shape
(70, 1)
>>> (a # a.T).shape
(70, 70)

Why different shapes of array can have those following calculation? [duplicate]

I don't understand broadcasting. The documentation explains the rules of broadcasting but doesn't seem to define it in English. My guess is that broadcasting is when NumPy fills a smaller dimensional array with dummy data in order to perform an operation. But this doesn't work:
>>> x = np.array([1,3,5])
>>> y = np.array([2,4])
>>> x+y
*** ValueError: operands could not be broadcast together with shapes (3,) (2,)
The error message hints that I'm on the right track, though. Can someone define broadcasting and then provide some simple examples of when it works and when it doesn't?
The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations.
It's basically a way numpy can expand the domain of operations over arrays.
The only requirement for broadcasting is a way aligning array dimensions such that either:
Aligned dimensions are equal.
One of the aligned dimensions is 1.
So, for example if:
x = np.ndarray(shape=(4,1,3))
y = np.ndarray(shape=(3,3))
You could not align x and y like so:
4 x 1 x 3
3 x 3
But you could like so:
4 x 1 x 3
3 x 3
How would an operation like this result?
Suppose we have:
x = np.ndarray(shape=(1,3), buffer=np.array([1,2,3]),dtype='int')
array([[1, 2, 3]])
y = np.ndarray(shape=(3,3), buffer=np.array([1,1,1,1,1,1,1,1,1]),dtype='int')
array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
The operation x + y would result in:
array([[2, 3, 4],
[2, 3, 4],
[2, 3, 4]])
I hope you caught the drift. If you did not, you can always check the official documentation here.
Cheers!
1.What is Broadcasting?
Broadcasting is a Tensor operation. Helpful in Neural Network (ML, AI)
2.What is the use of Broadcasting?
Without Broadcasting addition of only identical Dimension(shape) Tensors is supported.
Broadcasting Provide us the Flexibility to add two Tensors of Different Dimension.
for Example: adding a 2D Tensor with a 1D Tensor is not possible without broadcasting see the image explaining Broadcasting pictorially
Run the Python example code understand the concept
x = np.array([1,3,5,6,7,8])
y = np.array([2,4,5])
X=x.reshape(2,3)
x is reshaped to get a 2D Tensor X of shape (2,3), and adding this 2D Tensor X with 1D Tensor y of shape(1,3) to get a 2D Tensor z of shape(2,3)
print("X =",X)
print("\n y =",y)
z=X+y
print("X + y =",z)
You are almost correct about smaller Tensor, no ambiguity, the smaller tensor will be broadcasted to match the shape of the larger tensor.(Small vector is repeated but not filled with Dummy Data or Zeros to Match the Shape of larger).
3. How broadcasting happens?
Broadcasting consists of two steps:
1 Broadcast axes are added to the smaller tensor to match the ndim of
the larger tensor.
2 The smaller tensor is repeated alongside these new axes to match the full shape
of the larger tensor.
4. Why Broadcasting not happening in your code?
your code is working but Broadcasting can not happen here because both Tensors are different in shape but Identical in Dimensional(1D).
Broadcasting occurs when dimensions are nonidentical.
what you need to do is change Dimension of one of the Tensor, you will experience Broadcasting.
5. Going in Depth.
Broadcasting(repetition of smaller Tensor) occurs along broadcast axes but since both the Tensors are 1 Dimensional there is no broadcast Axis.
Don't Confuse Tensor Dimension with the shape of tensor,
Tensor Dimensions are not same as Matrices Dimension.
Broadcasting is numpy trying to be smart when you tell it to perform an operation on arrays that aren't the same dimension. For example:
2 + np.array([1,3,5]) == np.array([3, 5, 7])
Here it decided you wanted to apply the operation using the lower dimensional array (0-D) on each item in the higher-dimensional array (1-D).
You can also add a 0-D array (scalar) or 1-D array to a 2-D array. In the first case, you just add the scalar to all items in the 2-D array, as before. In the second case, numpy will add row-wise:
In [34]: np.array([1,2]) + np.array([[3,4],[5,6]])
Out[34]:
array([[4, 6],
[6, 8]])
There are ways to tell numpy to apply the operation along a different axis as well. This can be taken even further with applying an operation between a 3-D array and a 1-D, 2-D, or 0-D array.
>>> x = np.array([1,3,5])
>>> y = np.array([2,4])
>>> x+y
*** ValueError: operands could not be broadcast together with shapes (3,) (2,)
Broadcasting is how numpy do math operations with array of different shapes. Shapes are the format the array has, for example the array you used, x , has 3 elements of 1 dimension; y has 2 elements and 1 dimension.
To perform broadcasting there are 2 rules:
1) Array have the same dimensions(shape) or
2)The dimension that doesn't match equals one.
for example x has shape(2,3) [or 2 lines and 3 columns];
y has shape(2,1) [or 2 lines and 1 column]
Can you add them? x + y?
Answer: Yes, because the mismatched dimension is equal to 1 (the column in y). If y had shape(2,4) broadcasting would not be possible, because the mismatched dimension is not 1.
In the case you posted:
operands could not be broadcast together with shapes (3,) (2,);
it is because 3 and 2 mismatched altough both have 1 line.
I would like to suggest to try the np.broadcast_arrays, run some demos may give intuitive ideas. Official Document is also helpful. From my current understanding, numpy will compare the dimension from tail to head. If one dim is 1, it will broadcast in the dimension, if one array has more axes, such (256*256*3) multiply (1,), you can view (1) as (1,1,1). And broadcast will make (256,256,3).

How does numpy dot product works [duplicate]

I try to run the code like below:
>>> import numpy as np
>>> A = np.array([[1,2], [3,4], [5,6]])
>>> A.shape
(3, 2)
>>> B = np.array([7,8])
>>> B.shape
(2,)
>>> np.dot(A,B)
array([23, 53, 83])
I thought the shape of np.dot(A,B) should be (1,3) not (3,).
The result of matrix return should be:
array([[23],[53],[83]])
23
53
83
not
array([23,53,83])
23 53 83
why the result occurred?
As its name suggests, the primary purpose of the numpy.dot() function is to deliver a scalar result by performing a traditional linear algebra dot product on two arrays of identical shape (m,).
Given this primary purpose, the documentation of numpy.dot() also talks about this scenario as the first (the first bullet point below):
numpy.dot(a, b, out=None)
1. If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
2. If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a # b is preferred.
3. If either a or b is 0-D (scalar), it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.
4. If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
Your case is covered by the 4 th bullet point above (as pointed out by #hpaulj) in his comments.
But then, it still does not fully answer your question as to why the result has shape (3,), and not (3,1) as you expected.
You are justified in expecting a result-shape of (3,1), only if shape of B is (2,1). In such a case, since A has shape (3,2), and B has shape (2,1), you'd be justified in expecting a result-shape of (3,1).
But here, B has a shape of (2,), and not (2,1). So, we are now in a territory that's outside the jurisdiction of the usual rules of matrix multiplication. So, it's really up to the designers of the numpy.dot() function as to how the result turns out to be. They could've chosen to treat this as an error ("dimension mis-match"). Instead, they've chosen to deal with this scenario, as described in this answer.
I'm quoting that answer, with some modifications to relate your code:
According to numpy a 1D array has only 1 dimension and all checks
are done against that dimension. Because of this we find that np.dot(A,B)
checks second dimension of A against the one dimension of B
So, the check would succeed, and numpy wouldn't treat this as an error.
Now, the only remaining question is why is the result-shape (3,) and not (3,1) or (1,3).
The answer to this is: in A, which has shape (3,2), we have consumed the last part (2,) to perform sum-product. The un-consumed part of A's shape is (3,), and hence the shape of the result of np.dot(A,B), would be (3,). To understand this further, if we take a different example in which A has a shape of (3,4,2), instead of (3,2), the un-consumed part of A's shape would be (3,4,), and the result of np.dot(A,B) would be (3,4,) instead of (3,) which your example produced.
Here's the code for you to verify:
import numpy as np
A = np.arange(24).reshape(3,4,2)
print ("A is:\n", A, ", and its shape is:", A.shape)
B = np.array([7,8])
print ("B is:\n", B, ", and its shape is:", B.shape)
C = np.dot(A,B)
print ("C is:\n", C, ", and its shape is:", C.shape)
The output of this is:
A is:
[[[ 0 1]
[ 2 3]
[ 4 5]
[ 6 7]]
[[ 8 9]
[10 11]
[12 13]
[14 15]]
[[16 17]
[18 19]
[20 21]
[22 23]]] , and its shape is: (3, 4, 2)
B is:
[7 8] , and its shape is: (2,)
C is:
[[ 8 38 68 98]
[128 158 188 218]
[248 278 308 338]] , and its shape is: (3, 4)
Another helpful perspective to understand the behavior in this example is below:
The array A of shape (3,4,2) can be conceptually visualized as an outer array of inner arrays, where the outer array has shape (3,4), and each inner array has shape (2,). On each of these inner arrays, the traditional dot product will therefore be performed using the array B (which has shape (2,), and the resulting scalars are all left in their own respective places, to form a (3,4) shape (the outer matrix shape). So, the overall result of numpy.dot(A,B), consisting of all these in-place scalar results, would have the shape (3,4).
In wiki
So (3, 2) dot with (2,1) will be (3,1)
How to fix
np.dot(A,B[:,None])
Out[49]:
array([[23],
[53],
[83]])
I just learned this dot product from Neural Network...
Anyway, it is the dot product between "1d" array and "nd" array.
enter image description here
As we can see, it calculates the sum of the multiplication for elements separately in the red box as "17 + 28"
Then enter image description here
Then enter image description here
A.shape is (3, 2), B.shape is (2,) this situation could directly use the rule #4 for the dot operation np.dot(A,B):
If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
Because the alignment will happen between B's 2 (only axis of B) and A's 2 (last axis of A) and 2 indeed equals 2, numpy will judge that this is absolutely legitimate for dot operation. Therefore these two "2" are "consumed", leaving A's (3,) "in the wild". This (3,) will therefore be the shape of the result.

NumPy - What is broadcasting?

I don't understand broadcasting. The documentation explains the rules of broadcasting but doesn't seem to define it in English. My guess is that broadcasting is when NumPy fills a smaller dimensional array with dummy data in order to perform an operation. But this doesn't work:
>>> x = np.array([1,3,5])
>>> y = np.array([2,4])
>>> x+y
*** ValueError: operands could not be broadcast together with shapes (3,) (2,)
The error message hints that I'm on the right track, though. Can someone define broadcasting and then provide some simple examples of when it works and when it doesn't?
The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations.
It's basically a way numpy can expand the domain of operations over arrays.
The only requirement for broadcasting is a way aligning array dimensions such that either:
Aligned dimensions are equal.
One of the aligned dimensions is 1.
So, for example if:
x = np.ndarray(shape=(4,1,3))
y = np.ndarray(shape=(3,3))
You could not align x and y like so:
4 x 1 x 3
3 x 3
But you could like so:
4 x 1 x 3
3 x 3
How would an operation like this result?
Suppose we have:
x = np.ndarray(shape=(1,3), buffer=np.array([1,2,3]),dtype='int')
array([[1, 2, 3]])
y = np.ndarray(shape=(3,3), buffer=np.array([1,1,1,1,1,1,1,1,1]),dtype='int')
array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
The operation x + y would result in:
array([[2, 3, 4],
[2, 3, 4],
[2, 3, 4]])
I hope you caught the drift. If you did not, you can always check the official documentation here.
Cheers!
1.What is Broadcasting?
Broadcasting is a Tensor operation. Helpful in Neural Network (ML, AI)
2.What is the use of Broadcasting?
Without Broadcasting addition of only identical Dimension(shape) Tensors is supported.
Broadcasting Provide us the Flexibility to add two Tensors of Different Dimension.
for Example: adding a 2D Tensor with a 1D Tensor is not possible without broadcasting see the image explaining Broadcasting pictorially
Run the Python example code understand the concept
x = np.array([1,3,5,6,7,8])
y = np.array([2,4,5])
X=x.reshape(2,3)
x is reshaped to get a 2D Tensor X of shape (2,3), and adding this 2D Tensor X with 1D Tensor y of shape(1,3) to get a 2D Tensor z of shape(2,3)
print("X =",X)
print("\n y =",y)
z=X+y
print("X + y =",z)
You are almost correct about smaller Tensor, no ambiguity, the smaller tensor will be broadcasted to match the shape of the larger tensor.(Small vector is repeated but not filled with Dummy Data or Zeros to Match the Shape of larger).
3. How broadcasting happens?
Broadcasting consists of two steps:
1 Broadcast axes are added to the smaller tensor to match the ndim of
the larger tensor.
2 The smaller tensor is repeated alongside these new axes to match the full shape
of the larger tensor.
4. Why Broadcasting not happening in your code?
your code is working but Broadcasting can not happen here because both Tensors are different in shape but Identical in Dimensional(1D).
Broadcasting occurs when dimensions are nonidentical.
what you need to do is change Dimension of one of the Tensor, you will experience Broadcasting.
5. Going in Depth.
Broadcasting(repetition of smaller Tensor) occurs along broadcast axes but since both the Tensors are 1 Dimensional there is no broadcast Axis.
Don't Confuse Tensor Dimension with the shape of tensor,
Tensor Dimensions are not same as Matrices Dimension.
Broadcasting is numpy trying to be smart when you tell it to perform an operation on arrays that aren't the same dimension. For example:
2 + np.array([1,3,5]) == np.array([3, 5, 7])
Here it decided you wanted to apply the operation using the lower dimensional array (0-D) on each item in the higher-dimensional array (1-D).
You can also add a 0-D array (scalar) or 1-D array to a 2-D array. In the first case, you just add the scalar to all items in the 2-D array, as before. In the second case, numpy will add row-wise:
In [34]: np.array([1,2]) + np.array([[3,4],[5,6]])
Out[34]:
array([[4, 6],
[6, 8]])
There are ways to tell numpy to apply the operation along a different axis as well. This can be taken even further with applying an operation between a 3-D array and a 1-D, 2-D, or 0-D array.
>>> x = np.array([1,3,5])
>>> y = np.array([2,4])
>>> x+y
*** ValueError: operands could not be broadcast together with shapes (3,) (2,)
Broadcasting is how numpy do math operations with array of different shapes. Shapes are the format the array has, for example the array you used, x , has 3 elements of 1 dimension; y has 2 elements and 1 dimension.
To perform broadcasting there are 2 rules:
1) Array have the same dimensions(shape) or
2)The dimension that doesn't match equals one.
for example x has shape(2,3) [or 2 lines and 3 columns];
y has shape(2,1) [or 2 lines and 1 column]
Can you add them? x + y?
Answer: Yes, because the mismatched dimension is equal to 1 (the column in y). If y had shape(2,4) broadcasting would not be possible, because the mismatched dimension is not 1.
In the case you posted:
operands could not be broadcast together with shapes (3,) (2,);
it is because 3 and 2 mismatched altough both have 1 line.
I would like to suggest to try the np.broadcast_arrays, run some demos may give intuitive ideas. Official Document is also helpful. From my current understanding, numpy will compare the dimension from tail to head. If one dim is 1, it will broadcast in the dimension, if one array has more axes, such (256*256*3) multiply (1,), you can view (1) as (1,1,1). And broadcast will make (256,256,3).

Categories

Resources