I'm having a hard time understanding how some of numpy's slicing and indexing works.
The first case is the following:
>>> x = np.array([[[1],[2],[3]], [[4],[5],[6]]])
>>> x.shape
(2, 3, 1)
>>> x[1:2]
array([[[4],
[5],
[6]]])
According to the documentation,
If the number of objects in the selection tuple is less than N , then
: is assumed for any subsequent dimensions.
So does that mean [[1], [2], [3]] , [[4], [5], [6]] is a 2x3 array itself?
And how does
x[1:2]
return
array([[[4],
[5],
[6]]])
?
The second is ellipsis,
>>> x[...,0]
array([[1, 2, 3],
[4, 5, 6]])
Ellipsis expand to the number of : objects needed to make a selection
tuple of the same length as x.ndim. There may only be a single
ellipsis present.
What does [...,0] mean?
For your first question, it means that x of shape (2, 3, 1) has 2 slices of 3x1 arrays.
In [40]: x
Out[40]:
array([[[1],
[2], # <= slice 1 of shape 3x1
[3]],
[[4],
[5], # <= slice 2 of shape 3x1
[6]]])
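Now, when you execute x[1:2], you get only the slice at index 1, which is the second slice above. Slices in Python and NumPy are left-inclusive and right-exclusive (a half-open interval, i.e. [1, 2)), so the stop index 2 is excluded. Because you indexed with a slice rather than an integer, the leading axis is kept and the result has shape (1, 3, 1).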
In [42]: x[1:2]
Out[42]:
array([[[4],
[5],
[6]]])
This is why you get only the second slice.
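One way to check which block x[1:2] returns is to compare it with plain integer indexing. This is a minimal sketch that only uses the array from the question; the names sliced and picked are made up for illustration.

```python
import numpy as np

x = np.array([[[1], [2], [3]], [[4], [5], [6]]])

# A slice keeps the axis: start=1 is included, stop=2 is excluded,
# so only index 1 (the second block) is selected.
sliced = x[1:2]
# An integer index selects the same data but drops the leading axis.
picked = x[1]

print(sliced.shape)                       # (1, 3, 1)
print(picked.shape)                       # (3, 1)
print(np.array_equal(sliced[0], picked))  # True
```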
For your second question,
In [45]: x.ndim
Out[45]: 3
So, when you use an ellipsis, it expands to as many : as are needed for the index to cover all 3 dimensions.
In [47]: x[...,0]
Out[47]:
array([[1, 2, 3],
[4, 5, 6]])
The above code is equivalent to x[:, :, 0]: you take both slices from the array x and pick index 0 along the last axis, which gives an array of shape (2, 3).
But instead, if you do
In [49]: x[0, ..., 0]
Out[49]: array([1, 2, 3])
Here, ... expands to a single :, so this is x[0, :, 0]: you take only the first slice from x and pick index 0 along the last axis, which gives shape (3,).
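The expansion of the ellipsis can be checked directly by writing out the colons by hand. A small sketch, again using only the array from the question (full and first are illustrative names):

```python
import numpy as np

x = np.array([[[1], [2], [3]], [[4], [5], [6]]])

# x.ndim is 3, so "..." stands for however many ":" are needed
# to bring the index up to three entries.
full = x[..., 0]      # same as x[:, :, 0]
first = x[0, ..., 0]  # same as x[0, :, 0]

print(np.array_equal(full, x[:, :, 0]))   # True
print(np.array_equal(first, x[0, :, 0]))  # True
print(full.shape, first.shape)            # (2, 3) (3,)
```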
Regarding "when you execute x[1:2], it just hands you over the first slice":
shouldn't that be the second slice? The output is slice 2:
In [42]: x[1:2]
Out[42]:
array([[[4],
[5],
[6]]])
Related
I have two arrays A and i with shapes (1, 3, 3) and (1, 3, 2) respectively. I want to define a new array I that picks the elements of A based on i. The current and desired outputs are shown below.
import numpy as np
i=np.array([[[0,0],[1,2],[2,2]]])
A = np.array([[[1,2,3],[4,5,6],[7,8,9]]], dtype=float)
I=A[0,i]
print([I])
The current output is
[array([[[[1.000000000, 2.000000000, 3.000000000],
[1.000000000, 2.000000000, 3.000000000]],
[[4.000000000, 5.000000000, 6.000000000],
[7.000000000, 8.000000000, 9.000000000]],
[[7.000000000, 8.000000000, 9.000000000],
[7.000000000, 8.000000000, 9.000000000]]]])]
The desired output is
[array([[[1],[6],[9]]])]
In [131]: A.shape, i.shape
Out[131]: ((1, 3, 3), (1, 3, 2))
That leading size 1 dimension just adds a [] layer, and complicates indexing (a bit):
In [132]: A[0]
Out[132]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
This is the indexing that I think you want:
In [133]: A[0,i[0,:,0],i[0,:,1]]
Out[133]: array([1, 6, 9])
If you really need a trailing size 1 dimension, add it after:
In [134]: A[0,i[0,:,0],i[0,:,1]][:,None]
Out[134]:
array([[1],
[6],
[9]])
From the desired numbers, I deduced that you wanted to use the 2 columns of i as indices to two different dimensions of A:
In [135]: i[0]
Out[135]:
array([[0, 0],
[1, 2],
[2, 2]])
Another way to do the same thing:
In [139]: tuple(i.T)
Out[139]:
(array([[0],
[1],
[2]]),
array([[0],
[2],
[2]]))
In [140]: A[0][tuple(i.T)]
Out[140]:
array([[1],
[6],
[9]])
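The pairing of the two index columns can be spelled out step by step. The sketch below reuses A and i from the question; rows, cols and picked are names introduced here for illustration, and the final reshape is only one way to get back the (1, 3, 1) layout shown as the desired output.

```python
import numpy as np

A = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]], dtype=float)
i = np.array([[[0, 0], [1, 2], [2, 2]]])

rows = i[0, :, 0]  # array([0, 1, 2])
cols = i[0, :, 1]  # array([0, 2, 2])

# Index arrays of equal shape are paired element by element:
# (0, 0), (1, 2), (2, 2).
picked = A[0, rows, cols]
print(picked)  # [1. 6. 9.]

# Restore the (1, 3, 1) layout from the question if it is needed.
I = picked.reshape(1, -1, 1)
print(I.shape)  # (1, 3, 1)
```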
You can enter
I=A[0,i[0,:,0],i[0,:,1]]
You can use numpy's take for that.
However, take works with a flat index, so you will need to use [0, 5, 8] for your indexes instead.
Here is an example:
>>> I = [A.shape[2] * x + y for x,y in i[0]] # Convert to flat indexes
>>> I = np.expand_dims(I, axis=(1,2))
>>> A.take(I)
array([[[1.]],
[[6.]],
[[9.]]])
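The manual conversion to flat indexes can also be done with np.ravel_multi_index. This is a sketch under the assumption that the leading axis of A has length 1, so flat positions in A[0] and in A coincide; rows, cols and flat are illustrative names.

```python
import numpy as np

A = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]], dtype=float)
i = np.array([[[0, 0], [1, 2], [2, 2]]])

# Split the (3, 2) index pairs into a row array and a column array.
rows, cols = i[0].T

# Convert (row, col) pairs into positions in the flattened 3x3 block.
flat = np.ravel_multi_index((rows, cols), A.shape[1:])
print(flat)          # [0 5 8]
print(A.take(flat))  # [1. 6. 9.]
```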
Is there a way to iterate over the columns of a 2D numpy array such that the iterators remain column vectors?
i.e.
>>> A = np.arange(9).reshape((3,3))
[[0 1 2]
[3 4 5]
[6 7 8]]
>>> np.hstack([a for a in some_way_of_iterating(A)])
[[0 1 2]
[3 4 5]
[6 7 8]]
This is useful, for example, when I want to pass the column vectors into a function that transforms the individual column vector without having to clutter stuff with reshapes
How about simple transpose:
B = np.hstack([a.reshape(-1,1) for a in A.T])
You need .reshape(-1,1) to get a shape of (n, 1) instead of just (n,).
In [39]: A = np.arange(1,10).reshape(3,3)
In [40]: A
Out[40]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Iteration on an array operates on the first dimension. It's much like iterating on a nested list - but slower. And like the list case it too reduces the dimension.
You could iterate on the range, and use advanced indexing, [i] to maintain the 2d, "column vector" shape:
In [41]: [A[:,[i]] for i in range(3)]
Out[41]:
[array([[1],
[4],
[7]]),
array([[2],
[5],
[8]]),
array([[3],
[6],
[9]])]
Or iterate on the transpose - but this still requires some form of reshape. I prefer the None/newaxis syntax.
In [42]: [a[:,None] for a in A.T]
Out[42]:
[array([[1],
[4],
[7]]),
array([[2],
[5],
[8]]),
array([[3],
[6],
[9]])]
Indexing and reshape can be combined with:
In [43]: A[:,0,None]
Out[43]:
array([[1],
[4],
[7]])
Or with slicing:
In [44]: A[:,1:2]
Out[44]:
array([[2],
[5],
[8]])
There is a difference that may matter. A[:,[i]] makes a copy, A[:,i,None] is a view.
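The copy/view difference can be observed with np.shares_memory. A short sketch (col_copy and col_view are illustrative names):

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)

col_copy = A[:, [0]]      # advanced indexing: new data
col_view = A[:, 0, None]  # basic indexing: shares data with A

print(np.shares_memory(A, col_copy))  # False
print(np.shares_memory(A, col_view))  # True

# Writing to the view changes A; writing to the copy would not.
col_view[0, 0] = 100
print(A[0, 0])  # 100
```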
This may be the time to reread the basic numpy indexing docs.
https://numpy.org/doc/stable/reference/arrays.indexing.html
Another possible, if ugly, way uses indexing and a transpose:
np.hstack([A[:,i][np.newaxis].T for i in range(len(A.T))])
I am using np.newaxis to facilitate the transpose. Based on @hpaulj's suggestion, this can be cleaned up significantly:
np.hstack([A[:,i,np.newaxis] for i in range(A.shape[1])])
Output:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
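If the column iteration is needed in several places, it could be wrapped in a small generator. iter_columns is a hypothetical helper name, not a NumPy function; it uses the slicing form so that each column stays a 2D view.

```python
import numpy as np

def iter_columns(arr):
    """Yield each column of a 2D array as an (n, 1) view."""
    for j in range(arr.shape[1]):
        # A length-1 slice keeps the second axis.
        yield arr[:, j:j + 1]

A = np.arange(9).reshape(3, 3)
cols = list(iter_columns(A))

print(cols[0].shape)                       # (3, 1)
print(np.array_equal(np.hstack(cols), A))  # True
```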
import numpy as np
a=np.array([ [1,2,3],[4,5,6],[7,8,9]])
How can I get the zeroth column as a column? I expect the output [[1],[4],[7]], but a[...,0] gives a 1D array. Maybe the next question answers this one.
How do I get the last 2 columns of a? a[...,1:2] gives the second column only, a[...,1:3] gives the last 2 columns, but a[...,3] is an out-of-bounds index. So, how does it work?
By the way, do ... and : have the same meaning? a[...,0] and a[:,0] give the same output. Can someone comment on this?
numpy indexing is built on python list conventions, but extended to multiple dimensions and multi-element indexing. It is powerful but complex, and sooner or later you should read the full indexing documentation, which distinguishes between 'basic' and 'advanced' indexing.
Like range and arange, a slice index has an 'open' (exclusive) stop value:
In [111]: a = np.arange(1,10).reshape(3,3)
In [112]: a
Out[112]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Indexing with a scalar reduces the dimension, regardless of where:
In [113]: a[1,:]
Out[113]: array([4, 5, 6])
In [114]: a[:,1]
Out[114]: array([2, 5, 8])
That also means a[1,1] returns 5, not np.array([[5]]).
Indexing with a slice preserves the dimension:
In [115]: a[1:2,:]
Out[115]: array([[4, 5, 6]])
so does indexing with a list or array (though this makes a copy, not a view):
In [116]: a[[1],:]
Out[116]: array([[4, 5, 6]])
... is a generalized : - use as many as needed.
In [117]: a[...,[1]]
Out[117]:
array([[2],
[5],
[8]])
You can adjust dimensions with newaxis or reshape:
In [118]: a[:,1,np.newaxis]
Out[118]:
array([[2],
[5],
[8]])
Note that trailing : are automatic. a[1] is the same as a[1,:]. But leading ones must be explicit.
List indexing also removes a 'dimension/nesting layer'
In [119]: alist = [[1,2,3],[4,5,6]]
In [120]: alist[0]
Out[120]: [1, 2, 3]
In [121]: alist[0][0]
Out[121]: 1
In [122]: [l[0] for l in alist] # a column equivalent
Out[122]: [1, 4]
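The rules above can be summarised by comparing result shapes for the same column. A quick sketch, assuming the same 3x3 array as in the session above:

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)

print(a[:, 1].shape)    # (3,)   a scalar index drops the axis
print(a[:, 1:2].shape)  # (3, 1) a slice keeps it (view)
print(a[:, [1]].shape)  # (3, 1) a list keeps it (copy)

# The stop value is exclusive, so the last two columns are 1:3,
# or equivalently -2: counting from the end.
print(a[..., -2:].tolist())  # [[2, 3], [5, 6], [8, 9]]
```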
import numpy as np
a=np.array([ [1,2,3],[4,5,6],[7,8,9]])
a[:,0] # first column
>>> array([1, 4, 7])
a[0,:] # first row
>>> array([1, 2, 3])
a[:,0:2] # first two columns
>>> array([[1, 2],
[4, 5],
[7, 8]])
a[0:2,:] # first two rows
>>> array([[1, 2, 3],
[4, 5, 6]])
Let's say I have a row vector of the shape (1, 256). I want to transform it into a column vector of the shape (256, 1) instead. How would you do it in Numpy?
You can use the transpose operation to do this:
Example:
In [2]: a = np.array([[1,2], [3,4], [5,6]])
In [5]: a.shape
Out[5]: (3, 2)
In [6]: a_trans = a.T #or: np.transpose(a), a.transpose()
In [8]: a_trans.shape
Out[8]: (2, 3)
In [7]: a_trans
Out[7]:
array([[1, 3, 5],
[2, 4, 6]])
Note that the original array a keeps its shape. The transpose operation returns a view with the axes swapped rather than a copy, so a and a_trans share the same data; call a.T.copy() if you need an independent array.
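Applied to the (1, 256) row vector from the question, the transpose looks like this. The sketch also checks memory sharing, since .T on a NumPy array gives a view of the same data; row, col and independent are illustrative names.

```python
import numpy as np

row = np.arange(256).reshape(1, 256)
col = row.T

print(row.shape)                   # (1, 256)
print(col.shape)                   # (256, 1)
print(np.shares_memory(row, col))  # True

# Because col is a view, writing to it also changes row.
col[0, 0] = -1
print(row[0, 0])  # -1

# Use .copy() when an independent column vector is needed.
independent = row.T.copy()
print(np.shares_memory(row, independent))  # False
```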
If your input array is 1D instead, you can promote it to a column vector by introducing a new (singleton) axis as the second dimension. Below is an example:
# 1D array
In [13]: arr = np.arange(6)
# promotion to a column vector (i.e., a 2D array)
In [14]: arr = arr[..., None] #or: arr = arr[:, np.newaxis]
In [15]: arr
Out[15]:
array([[0],
[1],
[2],
[3],
[4],
[5]])
In [12]: arr.shape
Out[12]: (6, 1)
For the 1D case, yet another option would be to use numpy.atleast_2d() followed by a transpose operation, as suggested by ankostis in the comments.
In [9]: np.atleast_2d(arr).T
Out[9]:
array([[0],
[1],
[2],
[3],
[4],
[5]])
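For a 1D input, the options mentioned here give the same result, whereas a plain transpose does nothing. A small comparison sketch (via_newaxis, via_atleast_2d and via_reshape are illustrative names):

```python
import numpy as np

arr = np.arange(6)

via_newaxis = arr[:, np.newaxis]
via_atleast_2d = np.atleast_2d(arr).T
via_reshape = arr.reshape(-1, 1)

print(via_newaxis.shape, via_atleast_2d.shape, via_reshape.shape)
# (6, 1) (6, 1) (6, 1)

# Transposing a 1D array is a no-op: there is only one axis to swap.
print(arr.T.shape)  # (6,)
```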
We can simply use the reshape functionality of numpy:
a=np.array([[1,2,3,4]])
a:
array([[1, 2, 3, 4]])
a.shape
(1,4)
b=a.reshape(-1,1)
b:
array([[1],
[2],
[3],
[4]])
b.shape
(4,1)
Some of the ways I have compiled to do this are:
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [2, 4, 5]])
>>> a
array([[1, 2, 3],
       [2, 4, 5]])
The simplest way is to transpose:
>>> a.T
array([[1, 2],
[2, 4],
[3, 5]])
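Another way to get the same shape is reshape. Note that the result is not the transpose, because reshape keeps the elements in their original row-major order: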
>>> a.reshape(a.shape[1], a.shape[0])
array([[1, 2],
[3, 2],
[4, 5]])
I have used a 2-dimensional array in all of these problems, the real problem arises when there is a 1-dimensional row vector which you want to columnize elegantly.
Numpy's reshape lets you pass just one of the dimensions (number of rows or number of columns); numpy figures out the other dimension by itself if you pass it as -1.
>>> a.reshape(-1, 1)
array([[1],
[2],
[3],
[2],
[4],
[5]])
>>> a = np.array([1, 2, 3])
>>> a.reshape(-1, 1)
array([[1],
[2],
[3]])
>>> a.reshape(2, -1)
...
ValueError: cannot reshape array of size 3 into shape (2,newaxis)
So, you can specify one dimension and let numpy infer the other, as long as (m * n) / your_choice is an integer.
If you want to know more about this -1, head over to:
What does -1 mean in numpy reshape?
Note: None of these operations changes the shape of the original array. They return a new array object, which is usually a view that shares its data with the original.
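The difference between transposing and reshaping is easy to miss, so here is a side-by-side sketch with the same 2x3 array used above. For a true row vector of shape (1, n), both give the same column vector.

```python
import numpy as np

a = np.array([[1, 2, 3], [2, 4, 5]])

transposed = a.T             # swaps axes: columns become rows
reshaped = a.reshape(3, 2)   # same element order, new row length

print(transposed.tolist())  # [[1, 2], [2, 4], [3, 5]]
print(reshaped.tolist())    # [[1, 2], [3, 2], [4, 5]]

# With a single row there is nothing to reorder, so both agree.
row = np.array([[1, 2, 3, 4]])
print(np.array_equal(row.T, row.reshape(-1, 1)))  # True
```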
You can use reshape() method of numpy object.
To transform any row vector to column vector, use
array.reshape(-1, 1)
To convert any column vector to row vector, use
array.reshape(1, -1)
reshape() is used to change the shape of the matrix.
So if you want to create a 2x2 matrix you can call the method like a.reshape(2, 2).
So why this -1 in the answer?
If you don't want to specify one dimension explicitly (or it is unknown) and want numpy to find the value for you, you can pass -1 for that dimension. numpy will then calculate the value from the remaining dimensions and the total number of elements. Keep in mind that you cannot pass -1 for more than one dimension.
Thus in the first case (array.reshape(-1, 1)) the second dimension (columns) is 1 and the first (rows) is unknown (-1). So numpy will figure out how to turn a 1-by-4 array into an x-by-1 array and find x for you.
An alternative solution with the reshape method is a.reshape(a.shape[1], a.shape[0]). Here you are explicitly specifying the dimensions.
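The rules for -1 can be tried out on a small array. This sketch assumes a 12-element array chosen only for illustration; the names a, errors and bad_shape are not from the answers above.

```python
import numpy as np

a = np.arange(12)

print(a.reshape(-1, 1).shape)  # (12, 1) column vector
print(a.reshape(1, -1).shape)  # (1, 12) row vector
print(a.reshape(3, -1).shape)  # (3, 4)  12 / 3 = 4 is inferred

# Both of these fail: 12 is not divisible by 5, and only one
# dimension may be left for numpy to infer.
errors = []
for bad_shape in [(5, -1), (-1, -1)]:
    try:
        a.reshape(bad_shape)
    except ValueError as exc:
        errors.append(type(exc).__name__)
print(errors)  # ['ValueError', 'ValueError']
```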
Using np.newaxis can be a bit counterintuitive. But it is possible.
>>> a = np.array([1,2,3])
>>> a.shape
(3,)
>>> a[:,np.newaxis].shape
(3, 1)
>>> a[:,None]
array([[1],
[2],
[3]])
np.newaxis is simply an alias for None, so you can use None instead.
But that is not recommended because it impairs readability.
Converting a row vector into a column vector in Python can be important, e.g. to use broadcasting:
import numpy as np
def colvec(rowvec):
    v = np.asarray(rowvec)
    return v.reshape(v.size, 1)
colvec([1,2,3]) * [[1,2,3], [4,5,6], [7,8,9]]
Multiplies the first row by 1, the second row by 2 and the third row by 3:
array([[ 1, 2, 3],
[ 8, 10, 12],
[ 21, 24, 27]])
In contrast, trying to use a column vector typed as matrix:
np.asmatrix([1, 2, 3]).transpose() * [[1,2,3], [4,5,6], [7,8,9]]
fails with error ValueError: shapes (3,1) and (3,3) not aligned: 1 (dim 1) != 3 (dim 0).
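With plain ndarrays, * is always element-wise, and the orientation of the vector decides whether rows or columns get scaled. A minimal sketch (scale, M, by_row and by_col are illustrative names):

```python
import numpy as np

scale = np.array([1, 2, 3])
M = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# (3, 1) * (3, 3): row k of M is multiplied by scale[k].
by_row = scale[:, np.newaxis] * M
# (3,) * (3, 3): column k of M is multiplied by scale[k].
by_col = scale * M

print(by_row.tolist())  # [[1, 2, 3], [8, 10, 12], [21, 24, 27]]
print(by_col.tolist())  # [[1, 4, 9], [4, 10, 18], [7, 16, 27]]
```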
I want to reduce the dimensions of an array after converting it to a list
a = np.array([[1,2],[3,4]])
print(a.shape)
b = np.array([[1],[3,4]], dtype=object)  # dtype=object is required for ragged input in NumPy 1.24+
print(b.shape)
Output:
(2, 2)
(2,)
I want a to have the same shape as b i.e. (2,)
>>> a = np.array([[1,2],[3,4], None], dtype=object)[:2]
>>> a
array([[1, 2], [3, 4]], dtype=object)
>>> a.shape
(2,)
Works, though is probably the wrong way to do it (I'm a numpy newb).
Do you understand what b is?
b = np.array([[1],[3,4]], dtype=object)
print(repr(b))
array([[1], [3, 4]], dtype=object)
b is a 1d array with 2 elements, each a list. np.array does this because the 2 sublists have different lengths, so it can't create a 2d array. (NumPy 1.24 and later raise a ValueError for such ragged input unless you pass dtype=object.)
a = np.array([[1,2],[3,4]])
print(repr(a))
array([[1, 2],
[3, 4]])
Here the 2 sublists have the same length, so it can create a 2d array. Each element is an integer. np.array tries to create the highest dimensional array that the input allows.
Probably the best way to create another array like b is to make a copy, and insert the desired lists.
a1 = b.copy()
a1[0] = [1,2]
# a1[1] = [3,4]
print(repr(a1))
array([[1, 2], [3, 4]], dtype=object)
You have to use this convoluted method because you are trying to do something 'unnatural'.
You commented about using vstack. Both work:
In [570]: np.vstack((a,b)) # (3,2) array
Out[570]:
array([[1, 2],
[3, 4],
[[1], [3, 4]]], dtype=object)
In [571]: np.vstack((a1,b)) # (2,2) array
Out[571]:
array([[[1, 2], [3, 4]],
[[1], [3, 4]]], dtype=object)
Your array b is little more than the original list in an array wrapper. Is that really what you need? The 2d a is a normal numpy array. b is an oddball construction.
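If a 1d array of lists really is what is needed, one approach that avoids relying on ragged input is to allocate an empty object array and fill it element by element. object_array_of_lists is a hypothetical helper name used only for this sketch.

```python
import numpy as np

def object_array_of_lists(lists):
    """Build a 1d object array whose elements are the given lists."""
    out = np.empty(len(lists), dtype=object)
    for k, item in enumerate(lists):
        # Assigning to a single element stores the list itself.
        out[k] = item
    return out

a = object_array_of_lists([[1, 2], [3, 4]])
b = object_array_of_lists([[1], [3, 4]])

print(a.shape, b.shape)  # (2,) (2,)
print(a[0])              # [1, 2]
```

This gives shape (2,) even when all sublists have the same length, which np.array would otherwise turn into a 2d array.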