Main Question:
Is it bad (perhaps in terms of computation time or memory) to use np.array([[1, 2, 3]]) everywhere instead of np.array([1, 2, 3])?
Motivation:
I come from a math background, so I like to think of things in terms of vectors and matrices. For example, I think of y = np.array([1, 2, 3]) as a row vector, or as a 1 X 3 matrix. However, numpy doesn't treat y quite like a 1 X 3 matrix. For instance, if we take a 2 X 3 matrix A = np.array([[1, 2, 3], [4, 5, 6]]), numpy will allow us to do the matrix multiplication A @ y even though the dimensions (2 X 3) times (1 X 3) don't make sense mathematically.
On the other hand, y @ A.T gives an error even though the dimensions (1 X 3) times (3 X 2) make sense.
So in conclusion np.array([1, 2, 3]) does not behave exactly as a matrix. However, from my experiments it seems that numpy does treat np.array([[1, 2, 3]]) as a bona fide 1 X 3 matrix. So if there are no downsides, I would prefer to use this version.
numpy has an old subclass np.matrix that makes sure everything has exactly 2 dimensions. But it is no longer recommended.
numpy tries to work equally well with 0,1,2, and more dimensions.
In [69]: A = np.array([[1,2,3],[4,5,6]])
In [70]: x = np.array([1,2,3])
In [71]: y = np.array([[1,2,3]])
In [72]: A.shape
Out[72]: (2, 3)
Matrix product of a (2,3) with a (3,) results in a (2,). Its docs say it expands the (3,) to (3,1), gets a (2,1) result, and then squeezes out that 1:
In [73]: A@x
Out[73]: array([14, 32])
The (2,3) with (1,3) transposed produces (2,1):
In [74]: A@y.T
Out[74]:
array([[14],
       [32]])
(3,) with (3,2) => (2,):
In [78]: x@A.T
Out[78]: array([14, 32])
(1,3) with (3,2) => (1,2):
In [79]: y@A.T
Out[79]: array([[14, 32]])
How does your math intuition handle 3d or higher arrays? matmul/@ handles them nicely. np.einsum does even better.
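To make the 3d case concrete, here is a minimal sketch (the array names and sizes are made up for illustration) of a stack of four (2,3) matrices multiplied by a stack of four (3,5) matrices, once with @ and once with np.einsum:

```python
import numpy as np

# Hypothetical stacks of matrices: 4 matrices of shape (2, 3) and 4 of shape (3, 5).
rng = np.random.default_rng(0)
A3 = rng.standard_normal((4, 2, 3))
B3 = rng.standard_normal((4, 3, 5))

# @ treats the leading axis as a batch axis and multiplies the last two axes.
C = A3 @ B3
print(C.shape)  # (4, 2, 5)

# The same contraction written explicitly with einsum:
# b = batch index, j = the shared (summed) index.
C_einsum = np.einsum('bij,bjk->bik', A3, B3)
print(np.allclose(C, C_einsum))  # True
```

The einsum string names every axis, which is why it scales to contractions that a purely 2d "matrix" view has no notation for.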
You can create (1,n) arrays if that makes you more comfortable, but beware that you'll still end up with 1d or even 0d results.
For example with indexing:
In [80]: A
Out[80]:
array([[1, 2, 3],
       [4, 5, 6]])
In [81]: A[1,:]
Out[81]: array([4, 5, 6])
In [82]: A[:,1]
Out[82]: array([2, 5])
In [83]: A[1,1]
Out[83]: 5
In [84]: A[1,1].shape
Out[84]: ()
In [85]: A[1,1].ndim
Out[85]: 0
or reduction along an axis:
In [86]: A.sum(axis=1)
Out[86]: array([ 6, 15])
though it's possible to retain dimensions:
In [87]: A.sum(axis=1, keepdims=True)
Out[87]:
array([[ 6],
       [15]])
In [88]: A[[1],:]
Out[88]: array([[4, 5, 6]])
In [89]: A[:,[1]]
Out[89]:
array([[2],
       [5]])
Another thing to keep in mind is that numpy operators mostly work element-wise, the main exception being @. In MATLAB, by contrast, A*B is matrix multiplication and A.*B is element-wise. Add to that broadcasting, which allows us to add a (2,3) and a (3,) array:
In [90]: A+x
Out[90]:
array([[2, 4, 6],
       [5, 7, 9]])
Here (2,3) + (3,) => (2,3) + (1,3) => (2,3). The (3,) 1d array often behaves as a (1,3) or even (1,1,3) if needed. But expansion in the other direction has to be explicit.
In [92]: A / A.sum(axis=1) # (2,3) with (2,) error
Traceback (most recent call last):
File "<ipython-input-92-fec3395556f9>", line 1, in <module>
A / A.sum(axis=1)
ValueError: operands could not be broadcast together with shapes (2,3) (2,)
In [93]: A / A.sum(axis=1, keepdims=True) # (2,3) with (2,1) ok
Out[93]:
array([[0.16666667, 0.33333333, 0.5 ],
[0.26666667, 0.33333333, 0.4 ]])
In [94]: A / A.sum(axis=1)[:,None]
Out[94]:
array([[0.16666667, 0.33333333, 0.5 ],
[0.26666667, 0.33333333, 0.4 ]])
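On the memory part of the original question, a quick check (a sketch, not a benchmark) suggests the extra length-1 axis costs nothing in the data buffer, since both arrays hold the same three elements and adding an axis to an existing array only produces a view:

```python
import numpy as np

x = np.array([1, 2, 3])      # shape (3,)
y = np.array([[1, 2, 3]])    # shape (1, 3)

# Same number of elements and same dtype, so the data buffers are the same size.
print(x.nbytes == y.nbytes)  # True

# Adding a length-1 axis to an existing array gives a view onto the same data.
y_view = x[None, :]
print(y_view.shape)                 # (1, 3)
print(np.shares_memory(x, y_view))  # True
```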
Related
Is
a = [1, 2, 3]
x = numpy.array(a)
a matrix of 3 cols and 1 row? I know that x = numpy.array([a]) is a 1x3 matrix, but I need the opposite.
I need to multiply two matrices, but the first one is a list passed to numpy.array(a).
I have not found documentation on a way to loop through a and add its items to x.
Edit: I am working on linear regression, so I need an nrows x 1 column matrix. My original data is in a list, I am using numpy's dot() function to multiply, and I need to transform my list into an nrows x 1 column matrix.
Fixed: the solution was to transpose x = numpy.array([a]) with x = x.transpose(), which gives me an nx1 matrix.
Thanks for the help; you helped me think.
It is a 1 dimensional array:
In [653]: x = np.array([1,2,3])
In [654]: x
Out[654]: array([1, 2, 3])
In [655]: x.shape
Out[655]: (3,)
In [656]: x.ndim
Out[656]: 1
The other is 2 dimensional:
In [657]: y = np.array([[1,2,3]])
In [658]: y
Out[658]: array([[1, 2, 3]])
In [659]: y.shape
Out[659]: (1, 3)
In [660]: y.ndim
Out[660]: 2
the transpose of y
In [661]: z = y.T
In [662]: z
Out[662]:
array([[1],
       [2],
       [3]])
In [663]: z.shape
Out[663]: (3, 1)
The transpose of x is the same as x
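A short check of that statement, plus two ways to get an actual column from the 1d array (a sketch using the same x as above):

```python
import numpy as np

x = np.array([1, 2, 3])

# Transposing a 1d array reverses a single axis, so nothing changes.
print(x.T.shape)               # (3,)
print(np.array_equal(x, x.T))  # True

# To get a (3, 1) column, an axis has to be added explicitly.
print(x[:, None].shape)        # (3, 1)
print(x.reshape(-1, 1).shape)  # (3, 1)
```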
Some multiply options:
In [664]: np.dot(x,y)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-664-6849a5f7ad6c> in <module>()
----> 1 np.dot(x,y)
ValueError: shapes (3,) and (1,3) not aligned: 3 (dim 0) != 1 (dim 0)
Read the np.dot docs for the rules about how shapes interact. The key phrase is 'last dimension of x pairs with the 2nd to the last of y'.
In [665]: np.dot(y,x)
Out[665]: array([14])
Here the (1,3) pairs with (3,) to produce a (1,).
Element wise multiplication. Here broadcasting rules apply
In [666]: x*y
Out[666]: array([[1, 4, 9]])
(3,) with (1,3) -> (1,3)(1,3) -> (1,3)
In [667]: x*z
Out[667]:
array([[1, 2, 3],
       [2, 4, 6],
       [3, 6, 9]])
(3,) with (3,1) -> (1,3)(3,1) -> (3,3)
A handy way of changing the (3,) array into a (3,1) is with None (np.newaxis):
In [671]: x[:,None]
Out[671]:
array([[1],
[2],
[3]])
In [672]: np.dot(x[:,None],y)
Out[672]:
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
(3,1) dot with (1,3) -> (3,3)
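For the linear-regression use case in the question, a minimal sketch might look like this. The weight w is a made-up value for illustration only; the point is the shapes:

```python
import numpy as np

a = [1, 2, 3]                    # original data as a plain list

# Two equivalent ways to get an (n, 1) column from the list.
X = np.array(a).reshape(-1, 1)
X_alt = np.array([a]).T          # the approach the asker settled on
print(np.array_equal(X, X_alt))  # True

# Hypothetical (1, 1) weight, just to show the shapes lining up.
w = np.array([[2.0]])
pred = np.dot(X, w)              # (3, 1) dot (1, 1) -> (3, 1)
print(X.shape, pred.shape)       # (3, 1) (3, 1)
```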
I have two numpy arrays:
a = np.array([1, 2, 3]).reshape(3, 1)
b = np.array([4, 5]).reshape(2,1)
When I use a*b.T, I expect it to fail because the shapes differ (for arrays, * performs element-wise multiplication).
But the result looks like matrix multiplication:
[[ 4, 5],
[ 8, 10],
[12, 15]]
# this shape is (3, 2)
Why does it work like this?
Your a * b.T is element multiplication, and works because of broadcasting. Addition, and many other binary operations work with this pair of shapes.
a is (3,1). b.T is (1,2). Broadcasting combines (3,1) with (1,2) to produce (3,2). Each size 1 dimension is stretched to match the size of the other array along that axis.
Unless you make arrays with np.matrix, * does not perform mathematical matrix multiplication. np.dot is used to perform that (@ and np.einsum also do this).
With this particular combination of shapes, the dot product is the same. np.outer(a,b) also produces this, the mathematical outer product. np.dot matches the last dimension of a with the 2nd to the last dimension of b.T. In this case they are both 1. dot is more interesting when the shared dimension has multiple items, producing the familiar sum of products.
In [5]: np.dot(a, b.T)
Out[5]:
array([[ 4, 5],
[ 8, 10],
[12, 15]])
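To illustrate the remark about the shared dimension, here is a sketch comparing the size-1 case from the question with a case where the shared dimension has 3 items (M is a made-up example array):

```python
import numpy as np

a = np.array([1, 2, 3]).reshape(3, 1)
b = np.array([4, 5]).reshape(2, 1)

# Shared dimension of size 1: dot, broadcasting * and outer all agree.
print(np.array_equal(np.dot(a, b.T), a * b.T))  # True
print(np.array_equal(np.outer(a, b), a * b.T))  # True

# Shared dimension of size 3: dot now sums products along it.
M = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
d = np.dot(M, a)                 # (2, 3) dot (3, 1) -> (2, 1)
print(d)                         # [[ 8], [26]]

# Element-wise * against the (1, 3) row only multiplies; the sum is a separate step.
e = (M * a.T).sum(axis=1, keepdims=True)
print(np.array_equal(d, e))      # True
```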
'outer' addition:
In [3]: a + b.T
Out[3]:
array([[5, 6],
[6, 7],
[7, 8]])
It may help to look at a and b like this:
In [7]: a
Out[7]:
array([[1],
[2],
[3]])
In [8]: b
Out[8]:
array([[4],
[5]])
In [9]: b.T
Out[9]: array([[4, 5]])
I generally don't use matrix to talk about numpy arrays unless they are created with np.matrix, or more frequently scipy.sparse. numpy arrays can be 0d, 1d, 2d and higher. I pay more attention to the shape than the names.
Let's say I have a row vector of the shape (1, 256). I want to transform it into a column vector of the shape (256, 1) instead. How would you do it in Numpy?
you can use the transpose operation to do this:
Example:
In [2]: a = np.array([[1,2], [3,4], [5,6]])
In [5]: a.shape
Out[5]: (3, 2)
In [6]: a_trans = a.T #or: np.transpose(a), a.transpose()
In [8]: a_trans.shape
Out[8]: (2, 3)
In [7]: a_trans
Out[7]:
array([[1, 3, 5],
[2, 4, 6]])
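Note that the original array a keeps its shape. The transpose operation does not copy the data; it returns a view with the axes swapped, so writing to a_trans also changes a. Call a.T.copy() if you need an independent array.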
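One way to check whether a transposed array shares its data with the original is np.shares_memory. The sketch below uses the same a as in the example; in NumPy, .T is documented as a view:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
a_trans = a.T

# The transposed array is a view onto the same buffer.
print(np.shares_memory(a, a_trans))  # True

# An explicit copy is independent of the original.
a_copy = a.T.copy()
print(np.shares_memory(a, a_copy))   # False

# Writing through the view is visible in the original.
a_trans[0, 0] = 99
print(a[0, 0])       # 99
print(a_copy[0, 0])  # 1
```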
If your input array is rather 1D, then you can promote the array to a column vector by introducing a new (singleton) axis as the second dimension. Below is an example:
# 1D array
In [13]: arr = np.arange(6)
# promotion to a column vector (i.e., a 2D array)
In [14]: arr = arr[..., None] #or: arr = arr[:, np.newaxis]
In [15]: arr
Out[15]:
array([[0],
[1],
[2],
[3],
[4],
[5]])
In [12]: arr.shape
Out[12]: (6, 1)
For the 1D case, yet another option would be to use numpy.atleast_2d() followed by a transpose operation, as suggested by ankostis in the comments.
In [9]: np.atleast_2d(arr).T
Out[9]:
array([[0],
[1],
[2],
[3],
[4],
[5]])
We can simply use the reshape functionality of numpy:
a=np.array([[1,2,3,4]])
a:
array([[1, 2, 3, 4]])
a.shape
(1,4)
b=a.reshape(-1,1)
b:
array([[1],
[2],
[3],
[4]])
b.shape
(4,1)
Some of the ways I have compiled to do this are:
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [2, 4, 5]])
>>> a
array([[1, 2, 3],
       [2, 4, 5]])
>>> np.transpose(a)
array([[1, 2],
       [2, 4],
       [3, 5]])
Another way to do it:
>>> a.T
array([[1, 2],
[2, 4],
[3, 5]])
Another option, which produces the same shape but, for a 2-dimensional input like this one, not the transposed element order, is:
>>> a.reshape(a.shape[1], a.shape[0])
array([[1, 2],
[3, 2],
[4, 5]])
I have used a 2-dimensional array in all of these problems, the real problem arises when there is a 1-dimensional row vector which you want to columnize elegantly.
Numpy's reshape lets you specify just one of the dimensions (number of rows or number of columns); if you pass -1 for the other dimension, numpy works it out by itself.
>>> a.reshape(-1, 1)
array([[1],
[2],
[3],
[2],
[4],
[5]])
>>> a = np.array([1, 2, 3])
>>> a.reshape(-1, 1)
array([[1],
[2],
[3]])
>>> a.reshape(2, -1)
...
ValueError: cannot reshape array of size 3 into shape (2,newaxis)
So you can specify one dimension without worrying about the other, as long as (m * n) / your_choice is an integer.
If you want to know more about this -1, head over to:
What does -1 mean in numpy reshape?
Note: All these operations return a new array and do not modify the original array.
You can use reshape() method of numpy object.
To transform any row vector to column vector, use
array.reshape(-1, 1)
To convert any column vector to row vector, use
array.reshape(1, -1)
reshape() is used to change the shape of the matrix.
So if you want to create a 2x2 matrix you can call the method like a.reshape(2, 2).
So why this -1 in the answer?
If you don't want to explicitly specify one dimension (or it is unknown) and want numpy to find the value for you, you can pass -1 for that dimension. numpy will then calculate the value automatically from the remaining dimensions. Keep in mind that you cannot pass -1 for more than one dimension.
Thus in the first case (array.reshape(-1, 1)) the second dimension (columns) is 1 and the first (rows) is unknown (-1). So numpy will figure out how to represent a 1-by-4 as x-by-1 and find the x for you.
An alternative solution with the reshape method is a.reshape(a.shape[1], a.shape[0]). Here you are explicitly specifying the dimensions.
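The rules above can be sketched as follows (the array values are arbitrary):

```python
import numpy as np

row = np.array([[1, 2, 3, 4]])   # shape (1, 4)

col = row.reshape(-1, 1)         # numpy infers 4 rows -> (4, 1)
back = col.reshape(1, -1)        # numpy infers 4 columns -> (1, 4)
print(col.shape, back.shape)     # (4, 1) (1, 4)

# Explicit dimensions give the same result for a row vector.
explicit = row.reshape(row.shape[1], row.shape[0])
print(np.array_equal(col, explicit))  # True

# Only one dimension may be left for numpy to infer.
try:
    row.reshape(-1, -1)
    two_unknowns_ok = True
except ValueError as err:
    two_unknowns_ok = False
    print("ValueError:", err)
```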
Using np.newaxis can be a bit counterintuitive. But it is possible.
>>> a = np.array([1,2,3])
>>> a.shape
(3,)
>>> a[:,np.newaxis].shape
(3, 1)
>>> a[:,None]
array([[1],
[2],
[3]])
np.newaxis is equal to None internally. So you can use None.
But it is not recommended because it impairs readability
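A quick sketch confirming that the two spellings are interchangeable, and that the position of the new axis decides between a column and a row:

```python
import numpy as np

a = np.array([1, 2, 3])

# np.newaxis is just an alias for None.
print(np.newaxis is None)            # True

col_a = a[:, np.newaxis]             # new axis last  -> (3, 1) column
col_b = a[:, None]
print(np.array_equal(col_a, col_b))  # True

row = a[np.newaxis, :]               # new axis first -> (1, 3) row
print(col_a.shape, row.shape)        # (3, 1) (1, 3)
```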
Converting a row vector into a column vector in Python can be important, e.g. to use broadcasting:
import numpy as np
def colvec(rowvec):
    v = np.asarray(rowvec)
    return v.reshape(v.size, 1)
colvec([1,2,3]) * [[1,2,3], [4,5,6], [7,8,9]]
Multiplies the first row by 1, the second row by 2 and the third row by 3:
array([[ 1, 2, 3],
[ 8, 10, 12],
[ 21, 24, 27]])
In contrast, trying to use a column vector typed as matrix:
np.asmatrix([1, 2, 3]).transpose() * [[1,2,3], [4,5,6], [7,8,9]]
fails with error ValueError: shapes (3,1) and (3,3) not aligned: 1 (dim 1) != 3 (dim 0).
How to convert (5,) numpy array to (5,1)?
And how to convert backwards from (5,1) to (5,)?
What is the purpose of (5,) array, why is one dimension omitted? I mean why we didn't always use (5,1) form?
Does this happen only with 1D and 2D arrays or does it happen across 3D arrays, like can (2,3,) array exist?
UPDATE:
I managed to convert from (5,) to (5,1) by
a= np.reshape(a, (a.shape[0], 1))
but suggested variant looks simpler:
a = a[:, None] or a = a[:, np.newaxis]
To convert from (5,1) to (5,) np.ravel can be used
a= np.ravel(a)
A numpy array with shape (5,) is a 1 dimensional array while one with shape (5,1) is a 2 dimensional array. The difference is subtle, but can alter some computations in a major way. One has to be especially careful since these changes can be bulldozed over by operations which flatten all dimensions, like np.mean or np.sum.
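A minimal sketch of how that can go wrong silently (the variable names y_true and y_pred are made up for illustration): subtracting a (5,) array from a (5,1) array broadcasts to (5,5), and np.mean then reduces it to a plausible-looking scalar.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # shape (5,)
y_pred = y_true.reshape(5, 1)                 # same values, shape (5, 1)

# (5, 1) - (5,) broadcasts to (5, 5): every prediction minus every target.
diff = y_pred - y_true
print(diff.shape)             # (5, 5)

# The mean flattens everything, so the shape mistake is hidden in the result.
wrong_mse = np.mean(diff ** 2)
print(wrong_mse)              # 4.0, although the values are identical

# Matching the shapes first gives the intended answer.
right_mse = np.mean((y_pred.ravel() - y_true) ** 2)
print(right_mse)              # 0.0
```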
In addition to @m-massias's answer, consider the following as an example:
17:00:25 [2]: import numpy as np
17:00:31 [3]: a = np.array([1,2])
17:00:34 [4]: b = np.array([[1,2], [3,4]])
17:00:45 [6]: b * a
Out[6]:
array([[1, 4],
[3, 8]])
17:00:50 [7]: b * a[:,None] # Different result!
Out[7]:
array([[1, 2],
[6, 8]])
a has shape (2,) and it is broadcast over the second dimension. So the result you get is that each row (the first dimension) is multiplied by the vector:
17:02:44 [10]: b * np.array([[1, 2], [1, 2]])
Out[10]:
array([[1, 4],
[3, 8]])
On the other hand, a[:,None] has the shape (2,1) and so the orientation of the vector is known to be a column. Hence, the result you get is from the following operation (where each column is multiplied by a):
17:03:39 [11]: b * np.array([[1, 1], [2, 2]])
Out[11]:
array([[1, 2],
[6, 8]])
I hope that sheds some light on how the two arrays will behave differently.
You can add a new axis to an array a by doing a = a[:, None] or a = a[:, np.newaxis]
As far as "one dimension omitted", I don't really understand your question, because it has no end : the array could be (5, 1, 1), etc.
Use reshape() function
e.g.
open python terminal and type following:
>>> import numpy as np
>>> a = np.random.random(5)
>>> a
array([0.85694461, 0.37774476, 0.56348081, 0.02972139, 0.23453958])
>>> a.shape
(5,)
>>> b = a.reshape(5, 1)
>>> b.shape
(5, 1)
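For the reverse direction and the "(2,3,)" part of the question, here is a small sketch. The trailing comma in (5,) is only Python's syntax for a one-element tuple, so nothing has been omitted from the shape:

```python
import numpy as np

a = np.arange(5)        # shape (5,)
b = a.reshape(5, 1)     # shape (5, 1)

# Several ways back from (5, 1) to (5,).
print(b.ravel().shape)          # (5,)
print(b.reshape(-1).shape)      # (5,)
print(b.squeeze(axis=1).shape)  # (5,)
print(b[:, 0].shape)            # (5,)

# A shape is a plain tuple; (5,) has one entry, and (2, 3,) is the same tuple as (2, 3).
print(len(a.shape), len(b.shape))  # 1 2
print((2, 3,) == (2, 3))           # True
print(np.zeros((2, 3,)).shape)     # (2, 3)
```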
I have what seems to be an easy question.
Observe the code:
In : x=np.array([0, 6])
Out: array([0, 6])
In : x.shape
Out: (2L,)
Which shows that the array has no second dimension, and therefore x is no different from x.T.
How can I make x have dimension (2L,1L)? The real motivation for this question is that I have an array y of shape [3L,4L], and I want y.sum(1) to be a vector that can be transposed, etc.
While you can reshape arrays, and add dimensions with [:,np.newaxis], you should be familiar with the most basic nested brackets, or list, notation. Note how it matches the display.
In [230]: np.array([[0],[6]])
Out[230]:
array([[0],
[6]])
In [231]: _.shape
Out[231]: (2, 1)
np.array also takes an ndmin parameter, though it adds extra dimensions at the start (the default location for numpy).
In [232]: np.array([0,6],ndmin=2)
Out[232]: array([[0, 6]])
In [233]: _.shape
Out[233]: (1, 2)
A classic way of making something 2d - reshape:
In [234]: y=np.arange(12).reshape(3,4)
In [235]: y
Out[235]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
sum (and related functions) has a keepdims parameter. Read the docs.
In [236]: y.sum(axis=1,keepdims=True)
Out[236]:
array([[ 6],
[22],
[38]])
In [237]: _.shape
Out[237]: (3, 1)
empty 2nd dimension isn't quite the terminology. More like a nonexistent 2nd dimension.
A dimension can have 0 terms:
In [238]: np.ones((2,0))
Out[238]: array([], shape=(2, 0), dtype=float64)
If you are more familiar with MATLAB, which has a minimum of 2d, you might like the np.matrix subclass. It takes steps to ensure that most operations return another 2d matrix:
In [247]: ym=np.matrix(y)
In [248]: ym.sum(axis=1)
Out[248]:
matrix([[ 6],
[22],
[38]])
The matrix sum does:
np.ndarray.sum(self, axis, dtype, out, keepdims=True)._collapse(axis)
The _collapse bit lets it return a scalar for ym.sum().
Another way to keep the dimension information is to index with a slice instead of an integer:
In [42]: X
Out[42]:
array([[0, 0],
[0, 1],
[1, 0],
[1, 1]])
In [43]: X[1].shape
Out[43]: (2,)
In [44]: X[1:2].shape
Out[44]: (1, 2)
In [45]: X[1]
Out[45]: array([0, 1])
In [46]: X[1:2] # this way will keep dimension
Out[46]: array([[0, 1]])