I am learning NumPy and I am not really sure what the * operator actually does. It seems like some form of multiplication, but I am not sure how it is determined. From ipython:
In [1]: import numpy as np
In [2]: a=np.array([[1,2,3]])
In [3]: b=np.array([[4],[5],[6]])
In [4]: a*b
Out[4]:
array([[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]])
In [5]: b*a
Out[5]:
array([[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]])
In [6]: b.dot(a)
Out[6]:
array([[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]])
In [7]: a.dot(b)
Out[7]: array([[32]])
It seems like it is doing matrix multiplication, but only b multiplied by a, not the other way around. What is going on?
It's a little bit complicated and has to do with the concept of broadcasting and the fact that NumPy's arithmetic operators are element-wise.
a is a 2D array with 1 row and 3 columns and b is a 2D array with 1 column and 3 rows.
If you try to multiply them element by element (which is what NumPy does for a * b, since the basic arithmetic operations are element-wise rather than matrix operations), it must broadcast the arrays so that they match in all their dimensions.
Since the first array is 1x3 and the second is 3x1, they can be broadcast to a 3x3 shape according to the broadcasting rules. They will look like:
a = [[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]
b = [[4, 4, 4],
[5, 5, 5],
[6, 6, 6]]
And now Numpy can multiply them element by element, giving you the result:
[[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]]
When you use the .dot method, it performs the standard matrix multiplication. More in the docs.
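The broadcasting described above can be made explicit with np.broadcast_to — a minimal sketch (not from the original answer) reproducing the result:

```python
import numpy as np

a = np.array([[1, 2, 3]])      # shape (1, 3)
b = np.array([[4], [5], [6]])  # shape (3, 1)

# Materialize the broadcast copies, then multiply element-wise
a3 = np.broadcast_to(a, (3, 3))
b3 = np.broadcast_to(b, (3, 3))

print(np.array_equal(a * b, a3 * b3))  # True: * is broadcast + elementwise
print(a.dot(b))                        # [[32]]: .dot is matrix multiplication
```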
* does elementwise multiplication.
Since the arrays are of different shapes, broadcasting rules will be applied.
In [5]: a.shape
Out[5]: (1, 3)
In [6]: b.shape
Out[6]: (3, 1)
In [7]: (a * b).shape
Out[7]: (3, 3)
All input arrays with ndim smaller than the input array of largest ndim, have 1’s prepended to their shapes (does not apply here).
The size in each dimension of the output shape is the maximum of all the input sizes in that dimension.
An input can be used in the calculation if its size in a particular dimension either matches the output size in that dimension, or has value exactly 1.
If an input has a dimension size of 1 in its shape, the first data entry in that dimension will be used for all calculations along that dimension. In other words, the stepping machinery of the ufunc will simply not step along that dimension (the stride will be 0 for that dimension).
So, the resulting shape must be (3, 3) (maximums of a and b dimension sizes) and while performing the multiplication numpy will not step through a's first dimension and b's second dimension (their sizes are 1).
The result's [i][j] element is equal to the product of broadcasted a's and b's [i][j] element.
(a * b)[0][0] == a[0][0] * b[0][0]
(a * b)[0][1] == a[0][1] * b[0][0] # (not stepping through b's second dimension)
(a * b)[0][2] == a[0][2] * b[0][0]
(a * b)[1][0] == a[0][0] * b[1][0] # (not stepping through a's first dimension)
etc.
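Those rules can be verified directly. A small sketch (my addition, using the same shapes as the question) via np.broadcast and the zero strides mentioned above:

```python
import numpy as np

a = np.ones((1, 3))
b = np.ones((3, 1))

# The output shape is the per-dimension maximum of the input shapes
print(np.broadcast(a, b).shape)   # (3, 3)

# A size-1 dimension is traversed with stride 0: the data is simply reused
bb = np.broadcast_to(b, (3, 3))
print(bb.strides[1])              # 0
```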
Related
I have an array f of shape (n,N) and an array w of shape (n,n). What is the fastest way to obtain an array fw of shape (n,n,N) whose elements are fw[i,j,:] = w[i,j] * f[j,:]? I see that this is similar to np.tensordot over axes 0 and 0, but without the final summation that contracts the dimensions. I would also like this to generalize to any number of dimensions, i.e. to start with A of shape (n_1,n_2,n_3,...) and B of shape (n_1,m_2,m_3,...) and obtain an array AB of shape (n_1,n_2,...,m_2,m_3,...). I know one way would be to take the outer product of A and B and then select only the elements where the shared index matches, but I don't believe that is the most efficient way.
For 3d arrays, you can use broadcasting to add another dimension to both arrays:
f = np.array([[1, 2, 3], [4, 5, 6]])
w = np.array([[7, 8], [9, 10]])
r = f[None, ...] * w[..., None]
print(r) # [[[ 7, 14, 21], [32, 40, 48]], [[ 9, 18, 27], [40, 50, 60]]]
print(r.shape) # (2, 2, 3)
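For what it's worth, the same product can also be written with np.einsum, which names the shared index explicitly (an equivalent formulation, not necessarily faster):

```python
import numpy as np

f = np.array([[1, 2, 3], [4, 5, 6]])  # shape (n, N)
w = np.array([[7, 8], [9, 10]])       # shape (n, n)

r1 = f[None, ...] * w[..., None]      # broadcasting version from above
r2 = np.einsum('ij,jk->ijk', w, f)    # r2[i, j, k] = w[i, j] * f[j, k]

print(np.array_equal(r1, r2))  # True
```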
I leave it to you to figure out if this works in the more general case and ask another question if not!
I want to split a 2D array this way:
Example.
From this 4x4 2D array:
np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
Create these four 2x2 2D arrays:
np.array([[1,2],[3,4]])
np.array([[5,6],[7,8]])
np.array([[9,10],[11,12]])
np.array([[13,14],[15,16]])
In the general case: from an NxN 2D array (a square array), create as many KxK 2D arrays as possible.
To be more precise: the output arrays do not necessarily have to be built from all the values of a single row.
Example:
Take a 2D 8x8 array with values from 1 to 64 and split it into 2D 2x2 arrays. The first row of the 8x8 array runs from 1 to 8; the first output 2x2 array will be np.array([[1,2],[3,4]]) and the second will be np.array([[5,6],[7,8]]). This continues until the last output array, np.array([[61,62],[63,64]]). Note that each 2x2 array is not filled with all the values from a single row (which is the desired behavior).
Is there a NumPy method that does this?
You're probably looking for something like numpy.reshape.
In your example:
numpy.array([[1,2,3,4], [5,6,7,8]]).reshape(4, 2)
>>> array([[1,2], [3,4], [5,6], [7,8]])
Or, as suggested by @MSeifert, using -1 as one of the dimensions lets numpy work out the division by itself:
numpy.array([[1,2,3,4], [5,6,7,8]]).reshape(-1, 2)
>>> array([[1,2], [3,4], [5,6], [7,8]])
To get your desired output, you need to reshape to a 3D array and then unpack the first dimension:
>>> inp = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
>>> list(inp.reshape(-1, 2, 2))
[array([[1, 2],
[3, 4]]),
array([[5, 6],
[7, 8]]),
array([[ 9, 10],
[11, 12]]),
array([[13, 14],
[15, 16]])]
You can also unpack using = if you want to store the arrays in different variables instead of in one list of arrays:
>>> out1, out2, out3, out4 = inp.reshape(-1, 2, 2)
>>> out1
array([[1, 2],
[3, 4]])
If you're okay with a 3D array containing your 2D 2x2 arrays you don't need unpacking or the list() call:
>>> inp.reshape(-1, 2, 2)
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]],
[[13, 14],
[15, 16]]])
The -1 is a special value for reshape. As the documentation states:
One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.
If you want it more general, just pick the desired square size and use it as the argument for reshape:
>>> inp = np.ones((8, 8)) # 8x8 array
>>> square_shape = 2
>>> inp.reshape(-1, square_shape, square_shape) # 16 2x2 arrays
>>> square_shape = 4
>>> inp.reshape(-1, square_shape, square_shape) # 4 4x4 arrays
If you want to split it row-wise, you may do np.reshape(arr, (-1, 2, 2), order='C').
If you want to split it column-wise, you may do np.reshape(arr, (-1, 2, 2), order='F').
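A quick check of the row-wise variant, combining the order keyword with the -1 trick from the answers above (the target shape must account for all 16 elements, so (2, 2) alone would raise an error):

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

# Row-wise 2x2 chunks: C order (the default) reads elements row by row
chunks = arr.reshape(-1, 2, 2, order='C')
print(chunks[0])   # [[1 2]
                   #  [3 4]]
print(chunks[3])   # [[13 14]
                   #  [15 16]]
```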
I have two numpy arrays:
a = np.array([1, 2, 3]).reshape(3, 1)
b = np.array([4, 5]).reshape(2,1)
When I use a * b.T, I expect an error or a wrong output because there is a difference in their shapes (since * performs element-wise multiplication for arrays).
But the result returns Matrix multiplication, like this:
[[ 4, 5],
[ 8, 10],
[12, 15]]
# this shape is (3, 2)
Why does it work like this?
Your a * b.T is element multiplication, and works because of broadcasting. Addition, and many other binary operations work with this pair of shapes.
a is (3,1). b.T is (1,2). Broadcasting combines (3,1) with (1,2) to produce (3,2). Each size-1 dimension is stretched to match the other operand's size in that dimension.
Unless you make arrays with np.matrix, * does not perform mathematical matrix multiplication. np.dot is used for that (the @ operator and np.einsum can also do it).
With this particular combination of shapes, the dot product is the same. np.outer(a,b) also produces this, the mathematical outer product. np.dot matches the last dimension of a with the 2nd to the last dimension of b.T. In this case they are both 1. dot is more interesting when the shared dimension has multiple items, producing the familiar sum of products.
In [5]: np.dot(a, b.T)
Out[5]:
array([[ 4, 5],
[ 8, 10],
[12, 15]])
'outer' addition:
In [3]: a + b.T
Out[3]:
array([[5, 6],
[6, 7],
[7, 8]])
It may help to look at a and b like this:
In [7]: a
Out[7]:
array([[1],
[2],
[3]])
In [8]: b
Out[8]:
array([[4],
[5]])
In [9]: b.T
Out[9]: array([[4, 5]])
I generally don't use matrix to talk about numpy arrays unless they are created with np.matrix, or more frequently scipy.sparse. numpy arrays can be 0d, 1d, 2d and higher. I pay more attention to the shape than the names.
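To make the coincidence concrete, a small sketch (my addition) comparing the three formulations for these particular shapes:

```python
import numpy as np

a = np.array([1, 2, 3]).reshape(3, 1)
b = np.array([4, 5]).reshape(2, 1)

# With a (3,1) and a (1,2) operand, the broadcast elementwise product
# coincides with the matrix product and with the outer product:
print(np.array_equal(a * b.T, np.dot(a, b.T)))  # True
print(np.array_equal(a * b.T, np.outer(a, b)))  # True
```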
I am trying to write a function whose arguments are arrays of different shapes. I am having some trouble understanding column arrays and making my function work for all array shapes. Here are the problems I found:
Transposing:
If the argument array A is not a vector then I can transpose it nicely using A.T however if A is a row vector this will NOT turn A into a column vector. If A is a column vector this will (strangely) turn it into a row vector. Is there a way to transpose an array independently of its shape?
Dot Product
The dot Product of a column vector with a scalar is a column vector (yeahh!). The dot Product of a column vector with a 1 element numpy array is a row vector (nayyy).
A = array((1,2)).reshape(2,1) #this is how I make a column vector (is there a better looking way?)
print dot(A,3) #column vector
b = dot(array((2,4)),A) #array with shape (1,)
print dot(A,b) #row vector..(bah)
Inversion
linalg.inv(array(2)) #gives an error, shouldn't it return 1/2 ?
Thanks for all the help!
P.S. Sorry for being a noob; I am used to Matlab and this way of writing things is very confusing for me.
P.S.2 I don't want to use matrices because arrays are more general
If you're used to Matlab, Numpy's way of dealing with "column" and "row" vectors is a little strange. The thing to realize is that a 1-d array is neither a column nor a row vector. To be a column or row vector, an array has to be a 2-d array with one dimension set to one. You can tell the difference between a 1-d array and a 2-d array with one row by looking at how many braces there are:
>>> a = numpy.arange(15)
>>> a
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
>>> b = a.reshape(1, -1)
>>> b
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]])
Now you can see that when you transpose these two, a stays the same, but b becomes a column vector:
>>> a.T
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
>>> b.T
array([[ 0],
[ 1],
[ 2],
[ 3],
[ 4],
[ 5],
[ 6],
[ 7],
[ 8],
[ 9],
[10],
[11],
[12],
[13],
[14]])
Again, this may seem a little strange -- but as you say, "arrays are more general." To achieve that generality, Numpy distinguishes strictly between arrays of different dimensions; a 1-d array simply can't be a "column" or "row" vector in any meaningful sense. The second dimension isn't defined at all!
The answers to your other questions follow from this observation. Your example code above generates an error for me, so I'll do something slightly different... which also generates an error, but a more informative one:
>>> A
array([[1],
[2]])
>>> B
array([2, 4])
>>> numpy.dot(A, B)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: objects are not aligned
Numpy complains that the objects are not aligned. That's because B is a 1-d array! Let's make it a true row vector:
>>> B = B.reshape(1, -1)
>>> B
array([[2, 4]])
>>> numpy.dot(A, B)
array([[2, 4],
[4, 8]])
>>> numpy.dot(B, A)
array([[10]])
Now everything makes sense. Dot simply performs matrix multiplication here; in one order the operation produces a 2x2 array; in the other, it produces a 1x1 array. Note the number of braces! Both of these are 2-d arrays. In turn, 10, [10], and [[10]] would all be different results.
Similarly, consider these three values:
>>> numpy.array(2)
array(2)
>>> numpy.array((2,))
array([2])
>>> numpy.array((2,)).reshape(1,-1)
array([[2]])
If you pass these to numpy.linalg.inv, you'll get errors for all but the last -- you can't take the matrix inverse of something that isn't a matrix! If you pass the last, the result is also a matrix:
>>> numpy.linalg.inv(numpy.array((2,)).reshape(1,-1))
array([[ 0.5]])
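A compact recap of the 1-d vs 2-d distinction, using indexing with None (an alias for np.newaxis) to build true row and column vectors:

```python
import numpy as np

v = np.arange(3)     # 1-d: neither a row nor a column vector
row = v[None, :]     # shape (1, 3): a true row vector
col = v[:, None]     # shape (3, 1): a true column vector

print(v.T.shape)     # (3,)  -- transposing a 1-d array is a no-op
print(row.T.shape)   # (3, 1)
print(np.dot(row, col))  # [[5]] == 0*0 + 1*1 + 2*2
```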
Transposing
It is important to distinguish between 1D arrays and 2D arrays. The row vector you are referring to is 1D, while the column vector is 2D. To demonstrate the difference, have a look at the following example.
First we demonstrate the default behavior of transposing a 2D array (even the column vector is a simple 2D array):
import numpy as np
print np.ones((3, 4)).T.shape
print np.ones((3, 1)).T.shape
The output is - as expected:
(4, 3)
(1, 3)
A 1D vector, however, does not change its size:
print np.ones((3,)).T.shape
Output:
(3,)
To quickly convert it into a 2D array, use [:,None]:
print np.ones((3,))[:,None].T.shape
Output:
(1, 3)
Dot product
To obtain the desired result, it is better to work with 2D arrays:
A = np.ones((2, 1)) # column vector
b = np.ones((1, 1)) # scalar
print np.dot(A, b) # column vector (as expected)
Output:
[[ 1.]
[ 1.]]
Yeah! :)
Inversion
Again, you need to make sure to work with 2D arrays. This can be done using the ndmin argument:
print np.linalg.inv(np.array(2,ndmin=2))
Output:
[[ 0.5]]
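And a quick check that the ndmin trick really yields an invertible 1x1 matrix (written in Python 3 syntax, unlike the print statements above):

```python
import numpy as np

m = np.array(2, ndmin=2)   # [[2]]: a 1x1 2-d array
inv = np.linalg.inv(m)

print(inv)                              # [[0.5]]
print(np.allclose(m @ inv, np.eye(1)))  # True
```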
How can I find the dimensions of a matrix in Python? len(A) returns only one value.
Edit:
close = dataobj.get_data(timestamps, symbols, closefield)
Is (I assume) generating a matrix of integers (less likely strings). I need to find the size of that matrix, so I can run some tests without having to iterate through all of the elements. As far as the data type goes, I assume it's an array of arrays (or list of lists).
For a list of lists, the number of rows is len(A) and the number of columns is len(A[0]), provided all rows have the same number of columns (i.e. all the inner lists are the same size).
If you are using NumPy arrays, shape can be used.
For example
>>> a = numpy.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
>>> a
array([[[ 1, 2, 3],
[ 1, 2, 3]],
[[12, 3, 4],
[ 2, 1, 3]]])
>>> a.shape
(2, 2, 3)
As Ayman farhat mentioned, you can use the simple method len(matrix) to get the number of rows, and len(matrix[0]) to get the number of columns:
>>> a=[[1,5,6,8],[1,2,5,9],[7,5,6,2]]
>>> len(a)
3
>>> len(a[0])
4
Also you can use a library that helps you with matrices "numpy":
>>> import numpy
>>> numpy.shape(a)
(3,4)
To get just the number of dimensions in NumPy:
len(a.shape)
In the first case:
import numpy as np
a = np.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
print("shape = ",np.shape(a))
print("dimensions = ",len(a.shape))
The output will be:
shape = (2, 2, 3)
dimensions = 3
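Note that len(a.shape) is the same as the ndim attribute; the np.ndim function additionally accepts plain nested lists, which may be handy if the input is not already an array:

```python
import numpy as np

a = np.array([[[1, 2, 3], [1, 2, 3]], [[12, 3, 4], [2, 1, 3]]])

print(len(a.shape) == a.ndim)     # True: both are 3
print(np.ndim([[1, 2], [3, 4]]))  # 2: works on a plain list of lists
```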
m = [[1, 1, 1, 0],[0, 5, 0, 1],[2, 1, 3, 10]]
print(len(m),len(m[0]))
Output
3 4
The correct answer is the following:
import numpy
numpy.shape(a)
Suppose you have an array a. To get the dimensions of an array, you should use shape.
import numpy as np
a = np.array([[3,20,99],[-13,4.5,26],[0,-1,20],[5,78,-19]])
a.shape
The output of this will be
(4, 3)
You may use the following to get the height and width of a NumPy array:
height = arr.shape[0]
width = arr.shape[1]
If your array has multiple dimensions, you can increase the index to access them.
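The same idea with tuple unpacking, which has the nice property of failing loudly if the array is not exactly 2-d:

```python
import numpy as np

arr = np.zeros((4, 3))
height, width = arr.shape  # unpacking requires exactly two dimensions

print(height, width)       # 4 3
```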
You can simply find a matrix's dimensions by using NumPy:
import numpy as np
x = np.arange(24).reshape((6, 4))
x.ndim
output will be:
2
It means this matrix is a 2 dimensional matrix.
x.shape
Will show you the size of each dimension. The shape for x is equal to:
(6, 4)
A simple way to look at it:
example:
h=np.array([[[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]]])
h.ndim
4
h
array([[[[ 1, 2, 3],
[ 3, 4, 5]],
[[ 5, 6, 7],
[ 7, 8, 9]],
[[ 9, 10, 11],
[12, 13, 14]]]])
If you observe closely, the number of opening square brackets at the beginning is what tells you the number of dimensions of the array.
In the above array, to access 7, the following indexing is used:
h[0,1,1,0]
However if we change the array to 3 dimensions as below,
h=np.array([[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]])
h.ndim
3
h
array([[[ 1, 2, 3],
[ 3, 4, 5]],
[[ 5, 6, 7],
[ 7, 8, 9]],
[[ 9, 10, 11],
[12, 13, 14]]])
To access element 7 in the above array, the index is h[1,1,0]