numpy column arrays and strange results - python

I am trying to write a function whose arguments are arrays of different shapes. I am having trouble understanding column arrays and making my function work for all array shapes. Here are the problems I found:
Transposing:
If the argument array A is not a vector, then I can transpose it nicely using A.T. However, if A is a row vector, this will NOT turn A into a column vector. If A is a column vector, this will (strangely) turn it into a row vector. Is there a way to transpose an array independently of its shape?
Dot Product
The dot Product of a column vector with a scalar is a column vector (yeahh!). The dot Product of a column vector with a 1 element numpy array is a row vector (nayyy).
A = array((1,2)).reshape(2,1) #this is how I make a column vector (is there a better looking way?)
print dot(A,3) #column vector
b = dot(array((2,4)),a) #array with shape (1,)
print dot(A,b) #row vector..(bah)
Inversion
linalg.inv(array(2)) #gives an error, shouldn't it return 1/2 ?
Thanks for all the help!
P.S. Sorry for being noob I am used to Matlab this way of writing things is very confusing for me ..
P.S.2 I don't want to use matrices because arrays are more general

If you're used to Matlab, Numpy's way of dealing with "column" and "row" vectors is a little strange. The thing to realize is that a 1-d array is neither a column nor a row vector. To be a column or row vector, an array has to be a 2-d array with one dimension of length one. You can tell the difference between a 1-d array and a 2-d array with one row by counting the square brackets:
>>> a = numpy.arange(15)
>>> a
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
>>> b = a.reshape(1, -1)
>>> b
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]])
Now you can see that when you transpose these two, a stays the same, but b becomes a column vector:
>>> a.T
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
>>> b.T
array([[ 0],
[ 1],
[ 2],
[ 3],
[ 4],
[ 5],
[ 6],
[ 7],
[ 8],
[ 9],
[10],
[11],
[12],
[13],
[14]])
Again, this may seem a little strange -- but as you say, "arrays are more general." To achieve that generality, Numpy distinguishes strictly between arrays of different dimensions; a 1-d array simply can't be a "column" or "row" vector in any meaningful sense. The second dimension isn't defined at all!
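One possible answer to the "transpose independently of shape" question is to promote the input to 2-d before transposing. The following is only a minimal sketch: the helper names as_column and as_row are made up for illustration and are not part of Numpy.

```python
import numpy as np

def as_column(x):
    """Return x as a 2-d column of shape (n, 1). Illustrative helper, not a Numpy function."""
    return np.asarray(x).reshape(-1, 1)

def as_row(x):
    """Return x as a 2-d row of shape (1, n). Illustrative helper, not a Numpy function."""
    return np.asarray(x).reshape(1, -1)

a = np.arange(3)      # 1-d array, shape (3,)
col = as_column(a)    # shape (3, 1)
row = as_row(a)       # shape (1, 3)

print(a.T.shape)      # (3,)   transposing a 1-d array changes nothing
print(col.T.shape)    # (1, 3)
print(row.T.shape)    # (3, 1)
```

Note that both helpers flatten whatever they receive, so they are only appropriate for vector-like inputs.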
The answers to your other questions follow from this observation. Your example code above generates an error for me, so I'll do something slightly different... which also generates an error, but a more informative one:
>>> A
array([[1],
[2]])
>>> B
array([2, 4])
>>> numpy.dot(A, B)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: objects are not aligned
Numpy complains that the objects are not aligned. That's because B is a 1-d array! Let's make it a true row vector:
>>> B = B.reshape(1, -1)
>>> B
array([[2, 4]])
>>> numpy.dot(A, B)
array([[2, 4],
[4, 8]])
>>> numpy.dot(B, A)
array([[10]])
Now everything makes sense. Dot simply performs matrix multiplication here; in one order the operation produces a 2x2 array; in the other, it produces a 1x1 array. Note the number of square brackets! Both of these are 2-d arrays. By contrast, 10, [10], and [[10]] would all be different results.
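The "row vector" result from the question can be reproduced in the same way. As a small sketch (the variable names b1, b2, r1 and r2 are chosen here for illustration), multiplying a column by a 1-d one-element array drops a dimension, while multiplying it by a 1x1 2-d array keeps the column shape:

```python
import numpy as np

A = np.array([[1], [2]])    # column vector, shape (2, 1)
b1 = np.array([3])          # 1-d array, shape (1,)
b2 = b1.reshape(1, 1)       # 2-d array, shape (1, 1)

r1 = np.dot(A, b1)          # 2-d times 1-d: the result is 1-d
r2 = np.dot(A, b2)          # 2-d times 2-d: ordinary matrix product

print(r1.shape)             # (2,)
print(r2.shape)             # (2, 1)
```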
Similarly, consider these three values:
>>> numpy.array(2)
array(2)
>>> numpy.array((2,))
array([2])
>>> numpy.array((2,)).reshape(1,-1)
array([[2]])
If you pass these to numpy.linalg.inv, you'll get errors for all but the last -- you can't take the matrix inverse of something that isn't a matrix! If you pass the last, the result is also a matrix:
>>> numpy.linalg.inv(numpy.array((2,)).reshape(1,-1))
array([[ 0.5]])
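If a function has to accept scalars as well as matrices, one option is to promote the input to 2-d before inverting. This is a hedged sketch rather than a recommendation; inv_any is a hypothetical helper name, and it relies on numpy.atleast_2d:

```python
import numpy as np

def inv_any(x):
    """Invert x after promoting scalars and one-element arrays to 2-d (illustrative helper)."""
    m = np.atleast_2d(np.asarray(x, dtype=float))
    return np.linalg.inv(m)

print(inv_any(2).shape)        # (1, 1)
print(inv_any([2]).shape)      # (1, 1)
print(inv_any([[2]]).shape)    # (1, 1)
```

All three calls return a 1x1 array holding 0.5. A 1-d input with more than one element becomes a 1xN array, which is not square, so numpy.linalg.inv still raises an error for it.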

Transposing
It is important to distinguish between 1D arrays and 2D arrays. The row vector you are referring to is 1D, while the column vector is 2D. To demonstrate the difference, have a look at the following example.
First we demonstrate the default behavior of transposing a 2D array (even the column vector is a simple 2D array):
import numpy as np
print(np.ones((3, 4)).T.shape)
print(np.ones((3, 1)).T.shape)
The output is - as expected:
(4, 3)
(1, 3)
A 1D vector, however, does not change its shape:
print(np.ones((3,)).T.shape)
Output:
(3,)
To quickly convert it into a 2D array, use [:,None]:
print(np.ones((3,))[:,None].T.shape)
Output:
(1, 3)
Dot product
To obtain the desired result, it is better to work with 2D arrays:
A = np.ones((2, 1)) # column vector
b = np.ones((1, 1)) # scalar
print(np.dot(A, b)) # column vector (as expected)
Output:
[[ 1.]
[ 1.]]
Yeah! :)
Inversion
Again, you need to make sure to work with 2D arrays. This can be done using the ndmin argument:
print(np.linalg.inv(np.array(2,ndmin=2)))
Output:
[[ 0.5]]

Related

Why does the shape remain the same when I sum a square numpy array along either direction?

I was expecting the shape to be (1,3) when I sum along axis=0 i.e. rows. But the shape remains same in both cases. Why is that?
>>> arr = np.arange(9).reshape(3,3)
>>> arr
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> arr.sum(1)
array([ 3, 12, 21])
>>> arr.sum(1).shape
(3,)
>>> arr.sum(0)
array([ 9, 12, 15])
>>> arr.sum(0).shape
(3,)
numpy.sum returns:
An array with the same shape as a, with the specified axis removed.
With one axis removed in both cases, you are left with a one-dimensional array, whose shape is a one-element tuple.
2 axes - 1 specified axis = 1 axis
However, passing keepdims as True in both gives different shapes, retaining all the axes in the original array with a corresponding change of length along the specified axis:
>>> arr.sum(axis=0, keepdims=True)
array([[ 9, 12, 15]])
>>> arr.sum(axis=1, keepdims=True)
array([[ 3],
[12],
[21]])
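One place where keepdims matters in practice is broadcasting the sums back against the original array. As an illustrative sketch (the names row_sums and normalized are chosen here), dividing each row by its own sum only lines up correctly when the sums keep their (3, 1) shape:

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

row_sums = arr.sum(axis=1, keepdims=True)   # shape (3, 1)
normalized = arr / row_sums                 # (3, 3) / (3, 1): each row divided by its own sum

print(row_sums.shape)            # (3, 1)
print(normalized.sum(axis=1))    # every row now sums to 1, up to rounding
```

Without keepdims the sums would have shape (3,) and would be broadcast along the columns instead, which silently divides by the wrong values.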
Because summing along the axis of a ND array yields a (N-1)D array. This makes sense if you consider that
np.sum([1,2,3]) == 6 # a 0D 'array'
If you want to turn your arr.sum(1) into a (1, 3) or (3, 1) 2D array, then use
s = arr.sum(0)[np.newaxis, :] # (1, 3)
or
s = arr.sum(1)[:, np.newaxis] # (3, 1)
According to the documentation this is what you'll get:
Returns:
sum_along_axis : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, a scalar is returned. If an output array is specified, a reference to out is returned.
The shape of arr is indeed (3,3) and is two-dimensional. If you remove one axis you'll be left with a shape of (3,) - which is one-dimensional.
An array with shape (1,3) still has two axes.
NumPy arrays follow a logic that is not the same as Matlab's, or even that of mathematics. The NumPy documentation for Matlab users puts it this way:
Handling of vectors (one-dimensional arrays) For array, the vector
shapes 1xN, Nx1, and N are all different things. Operations like
A[:,1] return a one-dimensional array of shape N, not a
two-dimensional array of shape Nx1. Transpose on a one-dimensional
array does nothing.
NumPy's story did not begin with linear algebra, so a one-dimensional object has no orientation: it is always displayed horizontally, cannot be transposed, and so on. This is confusing at first if you come from a different background, but it has a lot of advantages in other fields.
In NumPy, 2-dim arrays are lines (dim 0) of columns (dim 1), as for a matrix, but selecting a line or a column always returns ... a line!
As an example :
In [1]: m=np.arange(6).reshape(3,2)
In [2]: m
Out[2]:
array([[0, 1],
[2, 3],
[4, 5]])
In [3]: m[0,:]
Out[3]: array([0, 1])
In [4]: m[:,0]
Out[4]: array([0, 2, 4])
Once you accept this convention, nothing is very difficult.
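If you do need the selected column to stay two-dimensional, indexing with a length-one slice or a list instead of a bare integer keeps the axis. A short sketch, with variable names chosen for illustration:

```python
import numpy as np

m = np.arange(6).reshape(3, 2)

col_1d = m[:, 0]         # integer index drops the axis  -> shape (3,)
col_2d = m[:, 0:1]       # length-one slice keeps the axis -> shape (3, 1)
col_2d_alt = m[:, [0]]   # list index also keeps the axis  -> shape (3, 1)
row_2d = m[0:1, :]       # same idea for a row             -> shape (1, 2)

print(col_1d.shape, col_2d.shape, col_2d_alt.shape, row_2d.shape)
```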

Python Numpy syntax: what does array index as two arrays separated by comma mean?

I don't understand array as index in Python Numpy.
For example, I have a 2d array A in Numpy
[[1,2,3]
[4,5,6]
[7,8,9]
[10,11,12]]
What does A[[1,3], [0,1]] mean?
Just test it for yourself!
A = np.arange(12).reshape(4,3)
print(A)
>>> array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
By indexing the array with two lists the way you did, you get the element at row 1, column 0 and the element at row 3, column 1 (counting from zero). The two lists are paired up element by element.
A[[1,3], [0,1]]
>>> array([ 3, 10])
I'd highly encourage you to play around with that a bit and have a look at the documentation and the examples.
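To make the pairing explicit, here is a small sketch that compares the index-array form with picking the two elements one by one, and with numpy.ix_, which selects the full cross product of rows and columns instead. The variable names are illustrative only.

```python
import numpy as np

A = np.arange(12).reshape(4, 3)

pairs = A[[1, 3], [0, 1]]              # elements A[1, 0] and A[3, 1]
same = np.array([A[1, 0], A[3, 1]])    # the same two elements, picked individually
block = A[np.ix_([1, 3], [0, 1])]      # rows 1 and 3 crossed with columns 0 and 1

print(pairs)          # [ 3 10]
print(block.shape)    # (2, 2)
```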
You are creating a new array:
import numpy as np
A = [[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]]
A = np.array(A)
print(A[[1, 3], [0, 1]])
# [ 4 11]
See Indexing, Slicing and Iterating in the tutorial.
Multidimensional arrays can have one index per axis. These indices are given in a tuple separated by commas
Quoting the doc:
def f(x,y):
    return 10*x+y
b = np.fromfunction(f, (5, 4), dtype=int)
print(b[2, 3])
# -> 23
You can also use a NumPy array as index of an array. See Index arrays in the doc.
NumPy arrays may be indexed with other arrays (or any other sequence-like object that can be converted to an array, such as lists, with the exception of tuples; see the end of this document for why this is). The use of index arrays ranges from simple, straightforward cases to complex, hard-to-understand cases. For all cases of index arrays, what is returned is a copy of the original data, not a view as one gets for slices.

How to split a 2D array, creating arrays from "row to row" values

I want to split a 2D array this way:
Example.
From this 4x4 2D array:
np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
Create these four 2x2 2D arrays:
np.array([[1,2],[3,4]])
np.array([[5,6],[7,8]])
np.array([[9,10],[11,12]])
np.array([[13,14],[15,16]])
In a general case, from a NxN 2D array (square arrays) create 2D arrays of KxK shape, as many as possible.
Just to be more precise: an output array is not necessarily made of all the values from a row.
Example:
From a 2D 8x8 array, with values from 1 to 64, if I want to split this array into 2D 2x2 arrays, the first row of the 8x8 array holds the values 1 to 8, the first output 2x2 array will be np.array([[1,2],[3,4]]), and the second will be np.array([[5,6],[7,8]])... This continues until the last output array, which will be np.array([[61,62],[63,64]]). Note that each 2x2 array is not filled with all the values from a row (this is the intended result).
Is there a NumPy method that does this?
You're probably looking for something like numpy.reshape.
In your example:
numpy.array([[1,2,3,4], [5,6,7,8]]).reshape(4,2)
>>>array([[1,2], [3,4], [5,6], [7,8]])
Or, as suggested by @MSeifert, using -1 for one dimension lets NumPy work out that size by itself:
numpy.array([[1,2,3,4], [5,6,7,8]]).reshape(-1,2)
>>>array([[1,2], [3,4], [5,6], [7,8]])
To get your desired output, you need to reshape to a 3D array and then unpack the first dimension:
>>> inp = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
>>> list(inp.reshape(-1, 2, 2))
[array([[1, 2],
[3, 4]]),
array([[5, 6],
[7, 8]]),
array([[ 9, 10],
[11, 12]]),
array([[13, 14],
[15, 16]])]
You can also unpack using = if you want to store the arrays in different variables instead of in one list of arrays:
>>> out1, out2, out3, out4 = inp.reshape(-1, 2, 2)
>>> out1
array([[1, 2],
[3, 4]])
If you're okay with a 3D array containing your 2D 2x2 arrays you don't need unpacking or the list() call:
>>> inp.reshape(-1, 2, 2)
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]],
[[13, 14],
[15, 16]]])
The -1 is a special value for reshape. As the documentation states:
One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.
If you want it more general, use the side length K of the desired blocks as the argument for reshape (the total number of elements must be divisible by K*K):
>>> inp = np.ones((8, 8)) # 8x8 array
>>> square_shape = 2
>>> inp.reshape(-1, square_shape, square_shape) # 16 2x2 arrays
>>> square_shape = 4
>>> inp.reshape(-1, square_shape, square_shape) # 4 4x4 arrays
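Putting this together for the 8x8 example from the question, here is a minimal sketch. The helper name split_consecutive is hypothetical, and the spatial tiling shown at the end is included only as a contrast, since it is not what the question asks for.

```python
import numpy as np

def split_consecutive(arr, k):
    """Split arr into k x k blocks filled with consecutive elements (illustrative helper)."""
    arr = np.asarray(arr)
    if arr.size % (k * k) != 0:
        raise ValueError("total number of elements must be a multiple of k*k")
    return arr.reshape(-1, k, k)

inp = np.arange(1, 65).reshape(8, 8)
blocks = split_consecutive(inp, 2)

print(blocks.shape)   # (16, 2, 2)
print(blocks[0])      # holds 1, 2, 3, 4
print(blocks[-1])     # holds 61, 62, 63, 64

# For contrast: cutting the array into spatial 2x2 tiles gives different blocks.
tiles = inp.reshape(4, 2, 4, 2).swapaxes(1, 2).reshape(-1, 2, 2)
print(tiles[0])       # holds 1, 2, 9, 10
```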
If you want to split it row-wise, you may do np.reshape(arr,(2,2), order='C')
If you want to split it column-wise, you may do np.reshape(arr,(2,2), order='F')

What is the multiplication operator actually doing with numpy arrays? [duplicate]

I am learning NumPy and I am not really sure what is the operator * actually doing. It seems like some form of multiplication, but I am not sure how is it determined. From ipython:
In [1]: import numpy as np
In [2]: a=np.array([[1,2,3]])
In [3]: b=np.array([[4],[5],[6]])
In [4]: a*b
Out[4]:
array([[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]])
In [5]: b*a
Out[5]:
array([[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]])
In [6]: b.dot(a)
Out[6]:
array([[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]])
In [7]: a.dot(b)
Out[7]: array([[32]])
It seems like it is doing matrix multiplication, but only b multiplied by a, not the other way around. What is going on?
It's a little bit complicated and has to do with the concept of broadcasting and the fact that the arithmetic operators on NumPy arrays (such as *) work element-wise.
a is a 2D array with 1 row and 3 columns and b is a 2D array with 1 column and 3 rows.
If you try to multiply them element by element (which is what numpy tries to do if you do a * b because every basic operation except the dot operation is element wise), it must broadcast the arrays so that they match in all their dimensions.
Since the first array is 1x3 and the second is 3x1, they can be broadcast to a 3x3 array according to the broadcasting rules. They will look like:
a = [[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]
b = [[4, 4, 4],
[5, 5, 5],
[6, 6, 6]]
And now Numpy can multiply them element by element, giving you the result:
[[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18]]
When you call .dot, NumPy performs standard matrix multiplication instead.
* does elementwise multiplication.
Since the arrays are of different shapes, broadcasting rules will be applied.
In [5]: a.shape
Out[5]: (1, 3)
In [6]: b.shape
Out[6]: (3, 1)
In [7]: (a * b).shape
Out[7]: (3, 3)
All input arrays with ndim smaller than the input array of largest ndim, have 1’s prepended to their shapes (does not apply here).
The size in each dimension of the output shape is the maximum of all the input sizes in that dimension.
An input can be used in the calculation if its size in a particular dimension either matches the output size in that dimension, or has value exactly 1.
If an input has a dimension size of 1 in its shape, the first data entry in that dimension will be used for all calculations along that dimension. In other words, the stepping machinery of the ufunc will simply not step along that dimension (the stride will be 0 for that dimension).
So, the resulting shape must be (3, 3) (maximums of a and b dimension sizes) and while performing the multiplication numpy will not step through a's first dimension and b's second dimension (their sizes are 1).
The result's [i][j] element is equal to the product of broadcasted a's and b's [i][j] element.
(a * b)[0][0] == a[0][0] * b[0][0]
(a * b)[0][1] == a[0][1] * b[0][0] # (not stepping through b's second dimension)
(a * b)[0][2] == a[0][2] * b[0][0]
(a * b)[1][0] == a[0][0] * b[1][0] # (not stepping through a's first dimension)
etc.
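The stretching described above can be inspected directly with numpy.broadcast_to. The sketch below (variable names are illustrative) also contrasts * with the matrix product operator @:

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)

a_full = np.broadcast_to(a, (3, 3))   # the row repeated three times, without copying data
b_full = np.broadcast_to(b, (3, 3))   # the column repeated three times, without copying data

elementwise = a * b     # equivalent to a_full * b_full
matrix_ab = a @ b       # matrix product, shape (1, 1)
matrix_ba = b @ a       # matrix product, shape (3, 3)

print(np.array_equal(elementwise, a_full * b_full))   # True
print(np.array_equal(elementwise, matrix_ba))         # True: a column times a row is an outer product
print(matrix_ab)                                      # [[32]]
```

This is why a*b and b.dot(a) agree in the question: for a column and a row, the broadcast element-wise product happens to equal the outer product.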

How can I find the dimensions of a matrix in Python?

How can I find the dimensions of a matrix in Python? len(A) returns only one value.
Edit:
close = dataobj.get_data(timestamps, symbols, closefield)
Is (I assume) generating a matrix of integers (less likely strings). I need to find the size of that matrix, so I can run some tests without having to iterate through all of the elements. As far as the data type goes, I assume it's an array of arrays (or list of lists).
The number of rows of a list of lists would be: len(A) and the number of columns len(A[0]) given that all rows have the same number of columns, i.e. all lists in each index are of the same size.
If you are using NumPy arrays, shape can be used.
For example
>>> a = numpy.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
>>> a
array([[[ 1, 2, 3],
[ 1, 2, 3]],
[[12, 3, 4],
[ 2, 1, 3]]])
>>> a.shape
(2, 2, 3)
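For a plain list of lists, the len-based approach can be wrapped in a small function that also checks the "all rows have the same length" assumption. This is only a sketch; list_shape is a hypothetical helper name.

```python
def list_shape(rows):
    """Return (n_rows, n_cols) for a rectangular list of lists (illustrative helper)."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if n_rows else 0
    # Refuse ragged input instead of silently reporting the first row's length.
    if any(len(r) != n_cols for r in rows):
        raise ValueError("rows have different lengths")
    return n_rows, n_cols

print(list_shape([[1, 5, 6, 8], [1, 2, 5, 9], [7, 5, 6, 2]]))   # (3, 4)
```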
As Ayman farhat mentioned
you can use the simple method len(matrix) to get the length of rows and get the length of the first row to get the no. of columns using len(matrix[0]) :
>>> a=[[1,5,6,8],[1,2,5,9],[7,5,6,2]]
>>> len(a)
3
>>> len(a[0])
4
Alternatively, you can use the NumPy library, which is designed for working with matrices:
>>> import numpy
>>> numpy.shape(a)
(3, 4)
To get just the number of dimensions in NumPy:
len(a.shape)
In the first case:
import numpy as np
a = np.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
print("shape = ",np.shape(a))
print("dimensions = ",len(a.shape))
The output will be:
shape = (2, 2, 3)
dimensions = 3
m = [[1, 1, 1, 0],[0, 5, 0, 1],[2, 1, 3, 10]]
print(len(m),len(m[0]))
Output
3 4
The correct answer is the following:
import numpy
numpy.shape(a)
Suppose a is an array. To get its dimensions, use shape:
import numpy as np
a = np.array([[3,20,99],[-13,4.5,26],[0,-1,20],[5,78,-19]])
a.shape
The output of this will be
(4, 3)
You may use the following to get the height and width of a NumPy array:
height = arr.shape[0]
width = arr.shape[1]
If your array has multiple dimensions, you can increase the index to access them.
You simply can find a matrix dimension by using Numpy:
import numpy as np
x = np.arange(24).reshape((6, 4))
x.ndim
output will be:
2
It means this matrix is a 2 dimensional matrix.
x.shape
Will show you the size of each dimension. The shape for x is equal to:
(6, 4)
A simple way I look at it:
example:
h=np.array([[[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]]])
h.ndim
4
h
array([[[[ 1, 2, 3],
[ 3, 4, 5]],
[[ 5, 6, 7],
[ 7, 8, 9]],
[[ 9, 10, 11],
[12, 13, 14]]]])
If you look closely, the number of opening square brackets at the beginning gives the number of dimensions of the array.
In the above array to access 7, the below indexing is used,
h[0,1,1,0]
However if we change the array to 3 dimensions as below,
h=np.array([[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]])
h.ndim
3
h
array([[[ 1, 2, 3],
[ 3, 4, 5]],
[[ 5, 6, 7],
[ 7, 8, 9]],
[[ 9, 10, 11],
[12, 13, 14]]])
To access element 7 in the above array, the index is h[1,1,0]
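Rather than working out such an index by eye, numpy.argwhere can list every position of a value. A short sketch using the 3-dimensional array above (the name positions is chosen for illustration):

```python
import numpy as np

h = np.array([[[1, 2, 3], [3, 4, 5]],
              [[5, 6, 7], [7, 8, 9]],
              [[9, 10, 11], [12, 13, 14]]])

print(h.ndim)     # 3
print(h.shape)    # (3, 2, 3)

positions = np.argwhere(h == 7)    # one row of indices per matching element
print(positions.tolist())          # [[1, 0, 2], [1, 1, 0]]
print(h[tuple(positions[1])])      # 7
```

The value 7 occurs twice in this array, so h[1,1,0] is one of two valid answers; h[1,0,2] is the other.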
