I have two numpy arrays:
a = np.array([1, 2, 3]).reshape(3, 1)
b = np.array([4, 5]).reshape(2,1)
When I use a * b.T, I expect a wrong output because there is a difference in their shapes (using * performs element-wise multiplication for arrays).
But the result looks like a matrix multiplication:
[[ 4, 5],
[ 8, 10],
[12, 15]]
# this shape is (3, 2)
Why does it work like this?
Your a * b.T is element-wise multiplication, and it works because of broadcasting. Addition and many other binary operations also work with this pair of shapes.
a is (3,1). b.T is (1,2). Broadcasting combines (3,1) with (1,2) to produce (3,2). Each size-1 dimension is stretched to match the other array's size in that dimension.
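A quick way to check the broadcast result without building the full arrays is np.broadcast_shapes (available in NumPy 1.20+); a minimal sketch:

```python
import numpy as np

# Broadcasting pairs dimensions from the right; a size-1 axis is
# stretched to match the other operand's size on that axis.
print(np.broadcast_shapes((3, 1), (1, 2)))  # (3, 2)

a = np.array([1, 2, 3]).reshape(3, 1)
b = np.array([4, 5]).reshape(2, 1)
print((a * b.T).shape)  # (3, 2)
```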
Unless you make arrays with np.matrix, * does not perform mathematical matrix multiplication. np.dot is used for that (the @ operator and np.einsum also do it).
With this particular combination of shapes, the dot product is the same. np.outer(a, b) also produces this, the mathematical outer product. np.dot matches the last dimension of a with the second-to-last dimension of b.T. In this case they are both 1. dot is more interesting when the shared dimension has multiple items, producing the familiar sum of products.
In [5]: np.dot(a, b.T)
Out[5]:
array([[ 4, 5],
[ 8, 10],
[12, 15]])
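For these particular shapes all three spellings coincide; a small sketch verifying that:

```python
import numpy as np

a = np.array([1, 2, 3]).reshape(3, 1)
b = np.array([4, 5]).reshape(2, 1)

# The shared dimension of the dot product has size 1 here, so
# element-wise *, dot, and outer all give the same (3, 2) result.
r1 = a * b.T
r2 = np.dot(a, b.T)
r3 = np.outer(a, b)  # np.outer flattens its inputs first
assert (r1 == r2).all() and (r2 == r3).all()
```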
'outer' addition:
In [3]: a + b.T
Out[3]:
array([[5, 6],
[6, 7],
[7, 8]])
It may help to look at a and b like this:
In [7]: a
Out[7]:
array([[1],
[2],
[3]])
In [8]: b
Out[8]:
array([[4],
[5]])
In [9]: b.T
Out[9]: array([[4, 5]])
I generally don't use matrix to talk about numpy arrays unless they are created with np.matrix, or more frequently scipy.sparse. numpy arrays can be 0d, 1d, 2d and higher. I pay more attention to the shape than the names.
Related
Main Question:
Is it bad (perhaps in terms of computation time or memory) to use np.array([[1, 2, 3]]) everywhere instead of np.array([1, 2, 3])?
Motivation:
I come from a math background, so I like to think of things in terms of vectors and matrices. For example, I think of y = np.array([1, 2, 3]) as a row vector, or as a 1 x 3 matrix. However, numpy doesn't treat y quite like a 1 x 3 matrix. For instance, if we take a 2 x 3 matrix A = np.array([[1, 2, 3], [4, 5, 6]]), numpy will allow us to do the matrix multiplication A @ y even though the dimensions (2 x 3) times (1 x 3) don't make sense mathematically.
On the other hand, y @ A.T gives an error even though the dimensions (1 x 3) times (3 x 2) make sense.
So, in conclusion, np.array([1, 2, 3]) does not behave exactly like a matrix. However, from my experiments it seems that numpy does treat np.array([[1, 2, 3]]) as a bona fide 1 x 3 matrix. So if there are no downsides, I would prefer to use this version.
numpy has an old subclass np.matrix that makes sure everything has exactly 2 dimensions. But it is no longer recommended.
numpy tries to work equally well with 0,1,2, and more dimensions.
In [69]: A = np.array([[1,2,3],[4,5,6]])
In [70]: x = np.array([1,2,3])
In [71]: y = np.array([[1,2,3]])
In [72]: A.shape
Out[72]: (2, 3)
Matrix product of a (2,3) with a (3,) results in a (2,). The matmul docs say it appends a 1 to the (3,), making it (3,1), computes a (2,1) result, and then squeezes out that 1:
In [73]: A@x
Out[73]: array([14, 32])
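The expand-then-squeeze behavior the docs describe can be reproduced by hand; a sketch:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
x = np.array([1, 2, 3])

# matmul promotes the 1d operand to (3,1), computes a (2,1)
# product, then squeezes the appended axis back out.
expanded = (A @ x[:, None]).squeeze(axis=1)
assert (A @ x == expanded).all()
print(A @ x)  # [14 32]
```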
The (2,3) with (1,3) transposed produces (2,1):
In [74]: A@y.T
Out[74]:
array([[14],
[32]])
(3,) with (3,2) => (2,):
In [78]: x@A.T
Out[78]: array([14, 32])
(1,3) with (3,2) => (1,2):
In [79]: y@A.T
Out[79]: array([[14, 32]])
How does your math intuition handle 3d or higher arrays? matmul/@ handles them nicely. np.einsum does even better.
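A minimal sketch of what "handles them nicely" means: @ treats leading axes as a batch of independent matrix products (the shapes below are arbitrary, chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 2, 3))   # a batch of five (2,3) matrices
B = rng.random((5, 3, 4))   # a batch of five (3,4) matrices

# @ multiplies matching batch members: (5,2,3) @ (5,3,4) -> (5,2,4)
out = A @ B

# np.einsum spells out the same contraction explicitly
out2 = np.einsum('bij,bjk->bik', A, B)
assert out.shape == (5, 2, 4)
assert np.allclose(out, out2)
```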
You can create (1,n) arrays if that makes you more comfortable. But beware that you'll still end up with 1d or even 0d results.
For example with indexing:
In [80]: A
Out[80]:
array([[1, 2, 3],
[4, 5, 6]])
In [81]: A[1,:]
Out[81]: array([4, 5, 6])
In [82]: A[:,1]
Out[82]: array([2, 5])
In [83]: A[1,1]
Out[83]: 5
In [84]: A[1,1].shape
Out[84]: ()
In [85]: A[1,1].ndim
Out[85]: 0
or reduction along an axis:
In [86]: A.sum(axis=1)
Out[86]: array([ 6, 15])
though it's possible to retain dimensions:
In [87]: A.sum(axis=1, keepdims=True)
Out[87]:
array([[ 6],
[15]])
In [88]: A[[1],:]
Out[88]: array([[4, 5, 6]])
In [89]: A[:,[1]]
Out[89]:
array([[2],
[5]])
Another thing to keep in mind is that numpy performs most operations element-wise. The main exception is @. Whereas in MATLAB A*B is matrix multiplication and A.*B is element-wise, in numpy * is element-wise and @ is matrix multiplication. Add to that broadcasting, which allows us to add a (2,3) and a (3,) array:
In [90]: A+x
Out[90]:
array([[2, 4, 6],
[5, 7, 9]])
Here (2,3) + (3,) => (2,3) + (1,3) => (2,3). The (3,) 1d array often behaves as a (1,3) or even (1,1,3) if needed. But expansion in the other direction has to be explicit.
In [92]: A / A.sum(axis=1) # (2,3) with (2,) error
Traceback (most recent call last):
File "<ipython-input-92-fec3395556f9>", line 1, in <module>
A / A.sum(axis=1)
ValueError: operands could not be broadcast together with shapes (2,3) (2,)
In [93]: A / A.sum(axis=1, keepdims=True) # (2,3) with (2,1) ok
Out[93]:
array([[0.16666667, 0.33333333, 0.5 ],
[0.26666667, 0.33333333, 0.4 ]])
In [94]: A / A.sum(axis=1)[:,None]
Out[94]:
array([[0.16666667, 0.33333333, 0.5 ],
[0.26666667, 0.33333333, 0.4 ]])
Suppose I have a 2d and a 1d numpy array. I want to add the second array to each subarray of the first one and get a new 2d array as the result.
>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
>>> b = np.array([2, 3])
>>> c = ... # <-- What should be here?
>>> c
array([[3, 5],
[5, 7],
[7, 9],
[9, 11]])
I could use a loop, but I think there are standard ways to do it within numpy.
What is the best and quickest way to do it? Performance matters.
Thanks.
I think the comments are missing the explanation of why a + b works. It's called broadcasting.
Basically, if you have an NxM matrix and a 1xM vector (or a plain (M,) array, as here), you can directly use the + operator to "add the vector to each row of the matrix".
This also works if you have an Nx1 vector and want to add it to each column.
Broadcasting also works with other operators and other matrix dimensions.
Take a look at the documentation to fully understand broadcasting.
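Applied to the arrays in the question, broadcasting makes the whole thing a one-liner:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
b = np.array([2, 3])

# (4,2) + (2,): b is treated as a (1,2) row and added to every row of a
c = a + b
print(c)
# [[ 3  5]
#  [ 5  7]
#  [ 7  9]
#  [ 9 11]]
```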
I am so confused about Numpy array. Let's say I have two Numpy arrays.
a = np.array([[1,2], [3,4], [5,6]])
b = np.array([[1,10], [1, 10]])
My interpretations of a and b are 3x2 and 2x2 matrices, i.e,
a = 1 2 b = 1 10
3 4 1 10
5 6
Then, I thought it should be fine to do a * b since it is a multiplication of 3x2 and 2x2 matrices. However, it was not possible and I had to use a.dot(b).
Given this fact, I think my interpretation of a Numpy array is not right. Can anyone let me know how I should think of Numpy arrays? I know that I can do a*b if I convert a and b into np.matrix. However, looking at others' code, it seems that people are fine using Numpy arrays as matrices, so I wonder how I should understand Numpy arrays in terms of matrices.
For numpy arrays, the * operator is used for element-by-element multiplication of arrays. This is only well defined if both arrays have the same shape (or shapes compatible under broadcasting). To illustrate *-multiplication, note that element-by-element multiplication with the identity matrix does not return the same matrix:
>>> I = np.array([[1,0],[0,1]])
>>> B = np.array([[1,2],[3,4]])
>>> I*B
array([[ 1, 0],
[ 0, 4]])
Using the numpy function np.dot(a, b) produces the typical matrix multiplication:
>>> np.dot(I,B)
array([[ 1, 2],
[ 3, 4]])
np.dot is probably what you're looking for?
a = np.array([[1,2], [3,4], [5,6]])
b = np.array([[1,10], [1, 10]])
np.dot(a,b)
Out[6]:
array([[ 3, 30],
[ 7, 70],
[ 11, 110]])
How to convert (5,) numpy array to (5,1)?
And how to convert backwards from (5,1) to (5,)?
What is the purpose of a (5,) array? Why is one dimension omitted? I mean, why don't we always use the (5,1) form?
Does this happen only with 1D and 2D arrays, or does it happen with 3D arrays too? For example, can a (2,3,) array exist?
UPDATE:
I managed to convert from (5,) to (5,1) by
a = np.reshape(a, (a.shape[0], 1))
but suggested variant looks simpler:
a = a[:, None] or a = a[:, np.newaxis]
To convert from (5,1) to (5,) np.ravel can be used
a = np.ravel(a)
A numpy array with shape (5,) is a 1-dimensional array, while one with shape (5,1) is a 2-dimensional array. The difference is subtle, but it can alter some computations in a major way. One has to be especially careful, since these distinctions can be bulldozed over by operations which flatten all dimensions, like np.mean or np.sum.
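A short sketch of both points: how differently (5,) and (5,1) behave under broadcasting, and how reductions flatten dimensions unless keepdims is used:

```python
import numpy as np

v = np.arange(5)      # shape (5,): 1d
col = v[:, None]      # shape (5, 1): 2d column

# Under broadcasting the two behave very differently:
print((v + v).shape)    # (5,)    element-wise
print((col + v).shape)  # (5, 5)  outer-style addition

# Reductions collapse dimensions unless told otherwise:
print(col.sum())                              # 10, a 0d scalar
print(col.sum(axis=0, keepdims=True).shape)   # (1, 1)
```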
In addition to @m-massias's answer, consider the following as an example:
17:00:25 [2]: import numpy as np
17:00:31 [3]: a = np.array([1,2])
17:00:34 [4]: b = np.array([[1,2], [3,4]])
17:00:45 [6]: b * a
Out[6]:
array([[1, 4],
[3, 8]])
17:00:50 [7]: b * a[:,None] # Different result!
Out[7]:
array([[1, 2],
[6, 8]])
a has shape (2,) and it is broadcast over the second dimension. So the result you get is that each row (the first dimension) is multiplied by the vector:
17:02:44 [10]: b * np.array([[1, 2], [1, 2]])
Out[10]:
array([[1, 4],
[3, 8]])
On the other hand, a[:,None] has the shape (2,1) and so the orientation of the vector is known to be a column. Hence, the result you get is from the following operation (where each column is multiplied by a):
17:03:39 [11]: b * np.array([[1, 1], [2, 2]])
Out[11]:
array([[1, 2],
[6, 8]])
I hope that sheds some light on how the two arrays will behave differently.
You can add a new axis to an array a by doing a = a[:, None] or a = a[:, np.newaxis]
As far as "one dimension omitted" goes, I don't really understand your question, because there is no end to it: the array could also be (5, 1, 1), etc.
Use reshape() function
e.g.
open python terminal and type following:
>>> import numpy as np
>>> a = np.random.random(5)
>>> a
array([0.85694461, 0.37774476, 0.56348081, 0.02972139, 0.23453958])
>>> a.shape
(5,)
>>> b = a.reshape(5, 1)
>>> b.shape
(5, 1)
This question already has answers here:
how does multiplication differ for NumPy Matrix vs Array classes?
I am learning NumPy and I am not really sure what the * operator is actually doing. It seems like some form of multiplication, but I am not sure how it is determined. From ipython:
In [1]: import numpy as np
In [2]: a=np.array([[1,2,3]])
In [3]: b=np.array([[4],[5],[6]])
In [4]: a*b
Out[4]:
array([[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]])
In [5]: b*a
Out[5]:
array([[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]])
In [6]: b.dot(a)
Out[6]:
array([[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]])
In [7]: a.dot(b)
Out[7]: array([[32]])
It seems like it is doing matrix multiplication, but only b multiplied by a, not the other way around. What is going on?
It's a little bit complicated and has to do with the concept of broadcasting and the fact that basic numpy operations are element-wise.
a is a 2D array with 1 row and 3 columns and b is a 2D array with 1 column and 3 rows.
If you try to multiply them element by element (which is what numpy tries to do if you do a * b because every basic operation except the dot operation is element wise), it must broadcast the arrays so that they match in all their dimensions.
Since the first array is 1x3 and the second is 3x1, they can be broadcast to 3x3 according to the broadcasting rules. They will then look like:
a = [[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]
b = [[4, 4, 4],
[5, 5, 5],
[6, 6, 6]]
And now Numpy can multiply them element by element, giving you the result:
[[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]]
When you do a .dot operation, it performs the standard matrix multiplication. More in the docs.
* does elementwise multiplication.
Since the arrays are of different shapes, broadcasting rules will be applied.
In [5]: a.shape
Out[5]: (1, 3)
In [6]: b.shape
Out[6]: (3, 1)
In [7]: (a * b).shape
Out[7]: (3, 3)
All input arrays with ndim smaller than the input array of largest ndim, have 1’s prepended to their shapes (does not apply here).
The size in each dimension of the output shape is the maximum of all the input sizes in that dimension.
An input can be used in the calculation if its size in a particular dimension either matches the output size in that dimension, or has value exactly 1.
If an input has a dimension size of 1 in its shape, the first data entry in that dimension will be used for all calculations along that dimension. In other words, the stepping machinery of the ufunc will simply not step along that dimension (the stride will be 0 for that dimension).
So, the resulting shape must be (3, 3) (maximums of a and b dimension sizes) and while performing the multiplication numpy will not step through a's first dimension and b's second dimension (their sizes are 1).
The result's [i][j] element is equal to the product of broadcasted a's and b's [i][j] element.
(a * b)[0][0] == a[0][0] * b[0][0]
(a * b)[0][1] == a[0][1] * b[0][0] # (not stepping through b's second dimension)
(a * b)[0][2] == a[0][2] * b[0][0]
(a * b)[1][0] == a[0][0] * b[1][0] # (not stepping through a's first dimension)
etc.
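This stride-0 behavior can be observed directly with np.broadcast_to, which returns read-only views of the broadcast operands without copying any data; a sketch:

```python
import numpy as np

a = np.array([[1, 2, 3]])      # (1, 3)
b = np.array([[4], [5], [6]])  # (3, 1)

# Views of the broadcast operands: the stretched axis gets stride 0,
# so the same entry is reused along that axis instead of being copied.
ab = np.broadcast_to(a, (3, 3))
bb = np.broadcast_to(b, (3, 3))
print(ab.strides)  # first stride is 0: a's single row repeats
print(bb.strides)  # second stride is 0: b's single column repeats

# Element-wise * on the originals equals * on the broadcast views
assert ((a * b) == ab * bb).all()
```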