numpy array slicing index - python

import numpy as np
a=np.array([ [1,2,3],[4,5,6],[7,8,9]])
How can I get the column at index 0 as a 2D column? Expected output: [[1],[4],[7]]. a[...,0] gives a 1D array. Maybe the next question answers this one.
How do I get the last 2 columns of a? a[...,1:2] gives only the second column, a[...,2:3] gives only the last column, and a[...,3] is an invalid index. So how does it work?
By the way, do the operators ... and : have the same meaning? a[...,0] and a[:,0] give the same output. Can someone comment on this?

numpy indexing builds on Python list conventions, extended to multiple dimensions and multi-element indexing. It is powerful but complex; sooner or later you should read the full indexing documentation, which distinguishes between 'basic' and 'advanced' indexing.
Like range and arange, a slice index has an 'open' stop value: the stop index itself is excluded.
In [111]: a = np.arange(1,10).reshape(3,3)
In [112]: a
Out[112]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
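To connect the open stop value with the question about the last two columns, here is a minimal sketch (the variable names are only illustrative):

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)

# The stop index is excluded, so 1:3 selects columns 1 and 2.
last_two = a[:, 1:3]
# A negative start counts from the end; omitting the stop runs to the end.
last_two_neg = a[:, -2:]
# 2:3 selects only column 2, but keeps it as a 2-D column.
last_one = a[:, 2:3]

print(last_two)
print(last_one.shape)
```

A stop of 3 is valid in a slice even though a[:, 3] as a scalar index is out of bounds, because the slice never touches index 3.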
Indexing with a scalar reduces the dimension, regardless of where:
In [113]: a[1,:]
Out[113]: array([4, 5, 6])
In [114]: a[:,1]
Out[114]: array([2, 5, 8])
That also means a[1,1] returns 5, not np.array([[5]]).
Indexing with a slice preserves the dimension:
In [115]: a[1:2,:]
Out[115]: array([[4, 5, 6]])
so does indexing with a list or array (though this makes a copy, not a view):
In [116]: a[[1],:]
Out[116]: array([[4, 5, 6]])
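One way to see the view/copy difference is np.shares_memory. This is a small sketch; the names row_view and row_copy are made up for illustration:

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)

row_view = a[1:2, :]   # basic (slice) indexing
row_copy = a[[1], :]   # advanced (list) indexing

print(np.shares_memory(a, row_view))  # slice result shares a's buffer
print(np.shares_memory(a, row_copy))  # list result has its own buffer

# Writing through the view changes a; writing to the copy does not.
row_view[0, 0] = 40
row_copy[0, 1] = -1
print(a)
```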
... is a generalized : - use as many as needed.
In [117]: a[...,[1]]
Out[117]:
array([[2],
[5],
[8]])
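The difference between ... and : only shows up with more than two dimensions. A minimal sketch on a made-up 3-D array x:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

# ... stands in for as many full slices as are needed.
last_axis_first = x[..., 0]   # same as x[:, :, 0]
first_axis_first = x[0, ...]  # same as x[0, :, :]

print(last_axis_first.shape)
print(np.array_equal(last_axis_first, x[:, :, 0]))
print(first_axis_first.shape)
```

On a 2-D array a[..., 0] and a[:, 0] happen to coincide, which is why the two look interchangeable in the question.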
You can adjust dimensions with newaxis or reshape:
In [118]: a[:,1,np.newaxis]
Out[118]:
array([[2],
[5],
[8]])
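Putting the options side by side, the following sketch (variable names are illustrative) shows four ways to get the same (3, 1) column:

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)

col_newaxis = a[:, 1, np.newaxis]    # scalar index, then add an axis back
col_reshape = a[:, 1].reshape(-1, 1) # scalar index, then reshape to a column
col_list = a[:, [1]]                 # list index keeps the dimension (copy)
col_slice = a[:, 1:2]                # slice keeps the dimension (view)

for col in (col_newaxis, col_reshape, col_list, col_slice):
    print(col.shape, col.ravel())
```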
Note that trailing : are automatic. a[1] is the same as a[1,:]. But leading ones must be explicit.
List indexing also removes a 'dimension/nesting layer'
In [119]: alist = [[1,2,3],[4,5,6]]
In [120]: alist[0]
Out[120]: [1, 2, 3]
In [121]: alist[0][0]
Out[121]: 1
In [122]: [l[0] for l in alist] # a column equivalent
Out[122]: [1, 4]

import numpy as np
a=np.array([ [1,2,3],[4,5,6],[7,8,9]])
a[:,0] # first column
>>> array([1, 4, 7])
a[0,:] # first row
>>> array([1, 2, 3])
a[:,0:2] # first two columns
>>> array([[1, 2],
[4, 5],
[7, 8]])
a[0:2,:] # first two rows
>>> array([[1, 2, 3],
[4, 5, 6]])

Related

Multi-dimensional array notation in Python

I have two arrays A and i with dimensions (1, 3, 3) and (1, 3, 2) respectively. I want to define a new array I which gives the elements of A based on i. The current and desired outputs are shown below.
import numpy as np
i=np.array([[[0,0],[1,2],[2,2]]])
A = np.array([[[1,2,3],[4,5,6],[7,8,9]]], dtype=float)
I=A[0,i]
print([I])
The current output is
[array([[[[1.000000000, 2.000000000, 3.000000000],
[1.000000000, 2.000000000, 3.000000000]],
[[4.000000000, 5.000000000, 6.000000000],
[7.000000000, 8.000000000, 9.000000000]],
[[7.000000000, 8.000000000, 9.000000000],
[7.000000000, 8.000000000, 9.000000000]]]])]
The desired output is
[array([[[1],[6],[9]]])]
In [131]: A.shape, i.shape
Out[131]: ((1, 3, 3), (1, 3, 2))
That leading size 1 dimension just adds a [] layer, and complicates indexing (a bit):
In [132]: A[0]
Out[132]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
This is the indexing that I think you want:
In [133]: A[0,i[0,:,0],i[0,:,1]]
Out[133]: array([1, 6, 9])
If you really need a trailing size 1 dimension, add it after:
In [134]: A[0,i[0,:,0],i[0,:,1]][:,None]
Out[134]:
array([[1],
[6],
[9]])
From the desired numbers, I deduced that you wanted to use the 2 columns of i as indices to two different dimensions of A:
In [135]: i[0]
Out[135]:
array([[0, 0],
[1, 2],
[2, 2]])
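The same pairing can be written by unpacking the two columns of i[0] first. This is only a sketch of the idea; rows and cols are names introduced here:

```python
import numpy as np

A = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]], dtype=float)
i = np.array([[[0, 0], [1, 2], [2, 2]]])

# i[0].T has shape (2, 3): first row holds row indices, second holds column indices.
rows, cols = i[0].T

# The two index arrays are paired element by element: (0,0), (1,2), (2,2).
picked = A[0, rows, cols]
as_column = picked[:, None]   # add a trailing size 1 dimension if needed

print(picked)
print(as_column.shape)
```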
Another way to do the same thing:
In [139]: tuple(i.T)
Out[139]:
(array([[0],
[1],
[2]]),
array([[0],
[2],
[2]]))
In [140]: A[0][tuple(i.T)]
Out[140]:
array([[1],
[6],
[9]])
You must enter
I=A[0,:1,i[:,1]]
You can use numpy's take for that.
However, take works with a flat index, so you will need to use [0, 5, 8] for your indexes instead.
Here is an example:
>>> I = [A.shape[2] * x + y for x,y in i[0]] # Convert to flat indexes
>>> I = np.expand_dims(I, axis=(1,2))
>>> A.take(I)
array([[[1.]],
[[6.]],
[[9.]]])
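As an alternative to computing the flat indexes by hand, np.ravel_multi_index can do the conversion. A minimal sketch, assuming the leading dimension of A has size 1 as in the question (so flat positions in A match those in A[0]):

```python
import numpy as np

A = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]], dtype=float)
i = np.array([[[0, 0], [1, 2], [2, 2]]])

# Convert (row, column) pairs into positions in the flattened 3x3 block.
flat = np.ravel_multi_index((i[0, :, 0], i[0, :, 1]), A.shape[1:])

# take() without an axis argument indexes the flattened array.
picked = A.take(flat)

print(flat)
print(picked)
```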

What is the meaning of the last line of code, x1[ x1[:,1]>3 ]

import numpy as np
x1 = np.array([[1,2,3],[4,5,6],[7,8,9]])
x1[ x1[:,1]>3 ]
For the code shown above, I don't understand why the output is
array([[4, 5, 6],[7, 8, 9]]).
It retrieves all rows whose second-column value is greater than 3. The : in x1[:,1] selects every row, and the 1 selects the second column.
Break it down:
In [10]: x1
Out[10]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [11]: x1[:,1] # select all rows, second column
Out[11]: array([2, 5, 8])
In [12]: x1[:,1]>3 # for each one of these, return whether it's > 3
Out[12]: array([False, True, True])
In [13]: x1[ x1[:,1]>3 ] # This is "Boolean array indexing"
Out[13]:
array([[4, 5, 6],
[7, 8, 9]])
The "Boolean array indexing" part filters the rows of x1 depending on the booleans contained in the boolean array x1[:,1]>3.
See Boolean array indexing in numpy doc.
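To make the filtering step more concrete, here is a small sketch that names the mask, shows the equivalent integer indexing, and combines two conditions. The names mask, row_numbers, and both are illustrative:

```python
import numpy as np

x1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

mask = x1[:, 1] > 3              # one boolean per row
rows = x1[mask]                  # keeps rows where the mask is True

# The same selection with explicit row numbers.
row_numbers = np.nonzero(mask)[0]
rows_again = x1[row_numbers]

# Conditions are combined with & and need parentheses.
both = x1[(x1[:, 1] > 3) & (x1[:, 2] < 9)]

print(rows)
print(row_numbers)
print(both)
```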

Numpy sort two arrays together with one array as the keys in axis 1 [duplicate]

I'm trying to get the indices to sort a multidimensional array by the last axis, e.g.
>>> a = np.array([[3,1,2],[8,9,2]])
And I'd like indices i such that,
>>> a[i]
array([[1, 2, 3],
[2, 8, 9]])
Based on the documentation of numpy.argsort I thought it should do this, but I'm getting the error:
>>> a[np.argsort(a)]
IndexError: index 2 is out of bounds for axis 0 with size 2
Edit: I need to rearrange other arrays of the same shape (e.g. an array b such that a.shape == b.shape) in the same way... so that
>>> b = np.array([[0,5,4],[3,9,1]])
>>> b[i]
array([[5,4,0],
[1,3,9]])
Solution:
>>> a[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
array([[1, 2, 3],
[2, 8, 9]])
You got it right, though I wouldn't describe it as cheating the indexing.
Maybe this will help make it clearer:
In [544]: i=np.argsort(a,axis=1)
In [545]: i
Out[545]:
array([[1, 2, 0],
[2, 0, 1]])
i is the order that we want, for each row. That is:
In [546]: a[0, i[0,:]]
Out[546]: array([1, 2, 3])
In [547]: a[1, i[1,:]]
Out[547]: array([2, 8, 9])
To do both indexing steps at once, we have to use a 'column' index for the 1st dimension.
In [548]: a[[[0],[1]],i]
Out[548]:
array([[1, 2, 3],
[2, 8, 9]])
Another array that could be paired with i is:
In [560]: j=np.array([[0,0,0],[1,1,1]])
In [561]: j
Out[561]:
array([[0, 0, 0],
[1, 1, 1]])
In [562]: a[j,i]
Out[562]:
array([[1, 2, 3],
[2, 8, 9]])
If i identifies the column for each element, then j specifies the row for each element. The [[0],[1]] column array works just as well because it can be broadcasted against i.
I think of
np.array([[0],
[1]])
as shorthand for j. Together they define the source row and column of each element of the new array. They work together, not sequentially.
The full mapping from a to the new array is:
[a[0,1] a[0,2] a[0,0]
a[1,2] a[1,0] a[1,1]]
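The broadcasting claim can be checked directly with np.broadcast_arrays. This is a sketch for illustration; rows, j, and i_b are names introduced here:

```python
import numpy as np

a = np.array([[3, 1, 2], [8, 9, 2]])
i = np.argsort(a, axis=1)

# Column of row numbers, shape (2, 1).
rows = np.arange(a.shape[0])[:, None]

# Broadcasting the (2, 1) column against the (2, 3) index array
# produces the explicit row-index array j from the text.
j, i_b = np.broadcast_arrays(rows, i)

print(j)
print(np.array_equal(a[j, i], a[rows, i]))
```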
def foo(a):
    i = np.argsort(a, axis=1)
    return (np.arange(a.shape[0])[:,None], i)
In [61]: foo(a)
Out[61]:
(array([[0],
[1]]), array([[1, 2, 0],
[2, 0, 1]], dtype=int32))
In [62]: a[foo(a)]
Out[62]:
array([[1, 2, 3],
[2, 8, 9]])
The above answers are now a bit outdated, since new functionality was added in numpy 1.15 to make it simpler; take_along_axis (https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.take_along_axis.html) allows you to do:
>>> a = np.array([[3,1,2],[8,9,2]])
>>> np.take_along_axis(a, a.argsort(axis=-1), axis=-1)
array([[1, 2, 3],
       [2, 8, 9]])
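The same index array can then be reused for the second array b from the question. A minimal sketch, assuming NumPy 1.15 or later:

```python
import numpy as np

a = np.array([[3, 1, 2], [8, 9, 2]])
b = np.array([[0, 5, 4], [3, 9, 1]])

# Compute the per-row sort order once from a ...
order = a.argsort(axis=-1)

# ... and apply it to both arrays.
a_sorted = np.take_along_axis(a, order, axis=-1)
b_sorted = np.take_along_axis(b, order, axis=-1)

print(a_sorted)
print(b_sorted)
```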
I found the answer here, with someone having the same problem. The key is just cheating the indexing to work properly...
>>> a[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
array([[1, 2, 3],
[2, 8, 9]])
You can also use linear indexing, which might be better with performance, like so -
M,N = a.shape
out = b.ravel()[a.argsort(1)+(np.arange(M)[:,None]*N)]
So, a.argsort(1)+(np.arange(M)[:,None]*N) basically are the linear indices that are used to map b to get the desired sorted output for b. The same linear indices could also be used on a for getting the sorted output for a.
Sample run -
In [23]: a = np.array([[3,1,2],[8,9,2]])
In [24]: b = np.array([[0,5,4],[3,9,1]])
In [25]: M,N = a.shape
In [26]: b.ravel()[a.argsort(1)+(np.arange(M)[:,None]*N)]
Out[26]:
array([[5, 4, 0],
[1, 3, 9]])
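To see what the linear indices look like for this small sample, the sketch below prints them before using them (linear is a name introduced here):

```python
import numpy as np

a = np.array([[3, 1, 2], [8, 9, 2]])
b = np.array([[0, 5, 4], [3, 9, 1]])
M, N = a.shape

# Per-row column order, shifted by the start offset of each row (0, N, 2N, ...).
linear = a.argsort(1) + np.arange(M)[:, None] * N

print(linear)
print(b.ravel()[linear])
print(a.ravel()[linear])
```

Each row offset is a multiple of N because row r of a C-ordered array starts at position r * N in the flattened view.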
Runtime tests -
In [27]: a = np.random.rand(1000,1000)
In [28]: b = np.random.rand(1000,1000)
In [29]: M,N = a.shape
In [30]: %timeit b[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
10 loops, best of 3: 133 ms per loop
In [31]: %timeit b.ravel()[a.argsort(1)+(np.arange(M)[:,None]*N)]
10 loops, best of 3: 96.7 ms per loop

Numpy 2D array indexing by other 2D along specific axis

I have a 2D array:
>>> in_arr = np.array([[1,2],[4,3]])
array([[1, 2],
[4, 3]])
and I find the sorted indices by columns to yield another 2D array:
>>> col_sort = np.argsort(in_arr, axis=1)
array([[0, 1],
[1, 0]])
I would like to know the efficient numpy slice to index the first by the second:
>>> reordered_in_arr = np.*SOME_SLICE_METHOD*(in_arr, col_sort, axis=1)
array([[1, 2],
[3, 4]])
The intention is to then perform a (more complicated) function on the array by column, e.g.:
>>> arr_with_function = reordered_in_arr ** np.array([1,2])
array([[1, 4],
[3, 16]])
and return the elements to their original position in the array
>>> return_order = np.argsort(col_sort, axis=1)
>>> reordered_in_arr = np.*SOME_SLICE_METHOD*(arr_with_function, return_order, axis=1)
array([[1, 4],
[16, 3]])
OK, thinking about it as I type, I might just use apply_over_axis, but I would still like to know how to do the above efficiently in case it is of value later.
If you want to do all those operations in-place then you don't need argsort(). Numpy supports in-place operations in such situations:
In [12]: in_arr = np.array([[1,2],[4,3]])
In [13]: in_arr.sort(axis=1)
In [14]: in_arr **= [1, 2]
In [15]: in_arr
Out[15]:
array([[ 1, 4],
[ 3, 16]])
But if you need the indices of the sorted items you can get the expected result with a simple indexing.
In [18]: in_arr[np.arange(2)[:,None], col_sort]
Out[18]:
array([[1, 2],
[3, 4]])
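For the full round trip in the question (sort, transform, restore the original positions), one possible sketch uses np.take_along_axis, which is available from NumPy 1.15. It is not part of the answer above, and the intermediate names are illustrative:

```python
import numpy as np

in_arr = np.array([[1, 2], [4, 3]])
col_sort = np.argsort(in_arr, axis=1)

# Reorder each row by its own sort order.
reordered = np.take_along_axis(in_arr, col_sort, axis=1)

# Apply the column-wise function to the sorted values.
transformed = reordered ** np.array([1, 2])

# argsort of a permutation gives its inverse, which undoes the reordering.
return_order = np.argsort(col_sort, axis=1)
restored = np.take_along_axis(transformed, return_order, axis=1)

print(reordered)
print(transformed)
print(restored)
```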

Can you multiply one element of a NumPy array and get the entire array as a result?

I have a multidimensional NumPy array, and I want to multiply the first element of each sub-array by some number. If I create an array and use slice notation to just get the elements I want to multiply, it returns just those elements in a new array, not the rest of the elements in the original array. How can I multiply the first elements and keep them in the original array?
Example: I do this
>>> arr = np.array([[1,2,3],[4,5,6]])
>>> arr
array([[1, 2, 3],
[4, 5, 6]])
>>> arr[:,0] * 5
and I get this
array([ 5, 20])
but I would like to get this
array([[ 5, 2, 3],
[20, 5, 6]])
You need to reassign the results:
In [8]: arr[:, 0] = arr[:, 0] * 5
In [9]: arr
Out[9]:
array([[ 5, 2, 3],
[20, 5, 6]])
try this:
arr = np.array([[1,2,3],[4,5,6]])
arr[:,0]*=5
The good old multiply AND assignment operator
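Both answers modify arr in place. If the original needs to stay untouched, or the factor is not an integer, a sketch along these lines may help (scaled and halved are names introduced here):

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Work on a copy so that arr itself is left unchanged.
scaled = arr.copy()
scaled[:, 0] *= 5

# For a non-integer factor, convert to float first so the
# results can be stored without truncation.
halved = arr.astype(float)
halved[:, 0] *= 0.5

print(arr)
print(scaled)
print(halved)
```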
