I am learning numpy framework.This piece of code I don't understand.
import numpy as np
a =np.array([[0,1,2],[3,4,5],[6,7,8],[9,10,11]])
print(a)
row = np.array([[0,0],[3,3]])
col = np.array([[0,2],[0,2]])
b = a[row,col]
print("This is b array:",b)
This b array returns the corner values of a array, that is, b equals [[0,2],[9,11]].
When indexing is done using an array or "array-like", to access/modify the elements of an array, then it's called advanced indexing.
In [37]: a
Out[37]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
In [38]: row
Out[38]:
array([[0, 0],
[3, 3]])
In [39]: col
Out[39]:
array([[0, 2],
[0, 2]])
In [40]: a[row, col]
Out[40]:
array([[ 0, 2],
[ 9, 11]])
That's what you got. Below is an explanation:
Indices of
`a[row, col]` row column
|| || || ||
VV VV VV VV
a[0, 0] a[0, 2]
a[3, 0] a[3, 2]
|__________| |
row-idx array |
|__________|
column-idx array
You're indexing a using two equally shaped 2d-arrays, hence you're output array will also have the same shape as col and row. To better understand how array indexing works you can check the docs, where as shown, indexing with 1d-arrays over the existing axis' of a given array works as follows:
result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M],
..., ind_N[i_1, ..., i_M]]
Where the same logic applies in the case of indexing with 2d-arrays over each axis, but instead you'd have a result array with up to i_N_M indices.
So going back to your example you are essentially selecting from the rows of a based on row, and from those rows you are selecting some columns col. You might find it more intuitive to translate the row and column indices into (x,y) coordinates:
(0,0), (0,2)
(3,0), (3,2)
Which, by accordingly selecting from a, results in the output array:
print(a[row,col])
array([[ 0, 2],
[ 9, 11]])
You can understand it by making more tries, to see more examples.
If you have one dimensional index:
In [58]: np.arange(10)[np.array([1,3,4,6])]
Out[58]: array([1, 3, 4, 6])
In case of two dimensional index:
In [57]: np.arange(10)[np.array([[1,3],[4,6]])]
Out[57]:
array([[1, 3],
[4, 6]])
If you use 3 dimensional index:
In [59]: np.arange(10)[np.array([[[1],[3]],[[4],[6]]])]
Out[59]:
array([[[1],
[3]],
[[4],
[6]]])
As you can see, if you make hierarchy in indexing, you will get it in the output as well.
Proceeding by steps:
import numpy as np
a = np.array([[0,1,2],[3,4,5],[6,7,8],[9,10,11]])
print(a)
gives 2d array a:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
Then:
row = np.array([[0,0],[3,3]])
assigns to 2d array row values [0,0] and [3,3]:
array([[0, 0],
[3, 3]])
Then:
col = np.array([[0,2],[0,2]])
assigns to 2d array col values [0,2] and [0,2]:
array([[0, 2],
[0, 2]])
Finally:
b = a[row,col]
assigns to b values given by a[0,0], a[0,2] for the first row, a[3,0], a[3,2] for the second row, that is:
array([[ 0, 2],
[ 9, 11]])
Where does b[0,0] <-- a[0,0] come from? It comes from the combination of row[0,0] which is 0 and col[0,0] which is 0.
What about b[0,1] <-- a[0,2]? It comes from the combination of row[0,1] which is 0 and col[0,1] which is 2.
And so forth.
Related
The output of the two commands below gives a different array shape, I do appreciate explaining why and referring me to a reference if any, I searched the internet but did not find any clear explanation for it.
data.shape
(11,2)
# outputs the values in column-0 in an (1x11) array.
data[:,0]
array([-7.24070e-01, -2.40724e+00, 2.64837e+00, 3.60920e-01,
6.73120e-01, -4.54600e-01, 2.20168e+00, 1.15605e+00,
5.06940e-01, -8.59520e-01, -5.99700e-01])
# outputs the values in column-0 in an (11x1) array
data[:,:-1]
array([[-7.24070e-01],
[-2.40724e+00],
[ 2.64837e+00],
[ 3.60920e-01],
[ 6.73120e-01],
[-4.54600e-01],
[ 2.20168e+00],
[ 1.15605e+00],
[ 5.06940e-01],
[-8.59520e-01],
[-5.99700e-01]])
I'll try to consolidate the comments into an answer.
First look at Python list indexing
In [92]: alist = [1,2,3]
selecting an item:
In [93]: alist[0]
Out[93]: 1
making a copy of the whole list:
In [94]: alist[:]
Out[94]: [1, 2, 3]
or a slice of length 2, or 1 or 0:
In [95]: alist[:2]
Out[95]: [1, 2]
In [96]: alist[:1]
Out[96]: [1]
In [97]: alist[:0]
Out[97]: []
Arrays follow the same basic rules
In [98]: x = np.arange(12).reshape(3,4)
In [99]: x
Out[99]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
Select a row:
In [100]: x[0]
Out[100]: array([0, 1, 2, 3])
or a column:
In [101]: x[:,0]
Out[101]: array([0, 4, 8])
x[0,1] selects an single element.
https://numpy.org/doc/stable/user/basics.indexing.html#single-element-indexing
Indexing with a slice returns multiple rows:
In [103]: x[0:2]
Out[103]:
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
In [104]: x[0:1] # it retains the dimensions, even if only 1 (or even 0)
Out[104]: array([[0, 1, 2, 3]])
Likewise for columns:
In [106]: x[:,0:1]
Out[106]:
array([[0],
[4],
[8]])
subslices on both dimensions:
In [107]: x[0:2,1:3]
Out[107]:
array([[1, 2],
[5, 6]])
https://numpy.org/doc/stable/user/basics.indexing.html
x[[0]] also returns a 2d array, but that gets into "advanced" indexing (which doesn't have a list equivalent).
The data type intp is mentioned in the following table:
Also, what does indexing means?
Integer array indexing
Integer array indexing allows selection of arbitrary items in the array based on their N-dimensional index. Each integer array represents a number of indexes into that dimension.
Purely integer array indexing
When the index consists of as many integer arrays as the array being indexed has dimensions, the indexing is straight forward, but different from slicing.
Advanced indexes always are broadcast and iterated as one:
result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M],
..., ind_N[i_1, ..., i_M]]
Note that the result shape is identical to the (broadcast) indexing array shapes ind_1, ..., ind_N.
Example
From each row, a specific element should be selected. The row index is just [0, 1, 2] and the column index specifies the element to choose for the corresponding row, here [0, 1, 0]. Using both together the task can be solved using advanced indexing:
x = np.array([[1, 2], [3, 4], [5, 6]])
x[[0, 1, 2], [0, 1, 0]]
array([1, 4, 5])
To achieve a behaviour similar to the basic slicing above, broadcasting can be used. The function ix_ can help with this broadcasting. This is best understood with an example.
Example
From a 4x3 array the corner elements should be selected using advanced indexing. Thus all elements for which the column is one of [0, 2] and the row is one of [0, 3] need to be selected. To use advanced indexing one needs to select all elements explicitly. Using the method explained previously one could write:
x = np.array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
rows = np.array([[0, 0],
[3, 3]], dtype=np.intp)
columns = np.array([[0, 2],
[0, 2]], dtype=np.intp)
x[rows, columns]
array([[ 0, 2],
[ 9, 11]])
However, since the indexing arrays above just repeat themselves, broadcasting can be used (compare operations such as rows[:, np.newaxis] + columns) to simplify this:
rows = np.array([0, 3], dtype=np.intp)
columns = np.array([0, 2], dtype=np.intp)
rows[:, np.newaxis]
array([[0],
[3]])
x[rows[:, np.newaxis], columns]
array([[ 0, 2],
[ 9, 11]])
This broadcasting can also be achieved using the function ix_:
x[np.ix_(rows, columns)]
array([[ 0, 2],
[ 9, 11]])
Note that without the np.ix_ call, only the diagonal elements would be selected, as was used in the previous example. This difference is the most important thing to remember about indexing with multiple advanced indexes.
Please read this.
REF: https://numpy.org/doc/stable/reference/arrays.indexing.html
My goal was to insert a column to the right on a numpy matrix. However, I found that the code I was using is putting in two columns rather than just one.
# This one results in a 4x1 matrix, as expected
np.insert(np.matrix([[0],[0]]), 1, np.matrix([[0],[0]]), 0)
>>>matrix([[0],
[0],
[0],
[0]])
# I would expect this line to return a 2x2 matrix, but it returns a 2x3 matrix instead.
np.insert(np.matrix([[0],[0]]), 1, np.matrix([[0],[0]]), 1)
>>>matrix([[0, 0, 0],
[0, 0, 0]]
Why do I get the above, in the second example, instead of [[0,0], [0,0]]?
While new use of np.matrix is discouraged, we get the same result with np.array:
In [41]: np.insert(np.array([[1],[2]]),1, np.array([[10],[20]]), 0)
Out[41]:
array([[ 1],
[10],
[20],
[ 2]])
In [42]: np.insert(np.array([[1],[2]]),1, np.array([[10],[20]]), 1)
Out[42]:
array([[ 1, 10, 20],
[ 2, 10, 20]])
In [44]: np.insert(np.array([[1],[2]]),1, np.array([10,20]), 1)
Out[44]:
array([[ 1, 10],
[ 2, 20]])
Insert as [1]:
In [46]: np.insert(np.array([[1],[2]]),[1], np.array([[10],[20]]), 1)
Out[46]:
array([[ 1, 10],
[ 2, 20]])
In [47]: np.insert(np.array([[1],[2]]),[1], np.array([10,20]), 1)
Out[47]:
array([[ 1, 10, 20],
[ 2, 10, 20]])
np.insert is a complex function written in Python. So we need to look at that code, and see how values are being mapped on the target space.
The docs elaborate on the difference between insert at 1 and [1]. But off hand I don't see an explanation of how the shape of values matters.
Difference between sequence and scalars:
>>> np.insert(a, [1], [[1],[2],[3]], axis=1)
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
>>> np.array_equal(np.insert(a, 1, [1, 2, 3], axis=1),
... np.insert(a, [1], [[1],[2],[3]], axis=1))
True
When adding an array at the end of another, I'd use concatenate (or one of its stack variants) rather than insert. None of these operate in-place.
In [48]: np.concatenate([np.array([[1],[2]]), np.array([[10],[20]])], axis=1)
Out[48]:
array([[ 1, 10],
[ 2, 20]])
I have an array I wan to use for mapping. Let's call it my_map ,type float shape (m,c).
I have a second array with indexes, lest call it my_indexes, type int size (n,c), every value is between 0 and m.
Trying to index my_map doing my_ans = my_map[my_indexes] I get an array of shape (n,c,c), when I was expecting (n,c). What would be the proper way to do it?
Just to be clear, what I am trying to do is something equivalent to:
my_ans = np.empty_like(touch_probability)
for i in range(c):
my_ans[:,i] = my_map[:,i][my_indexes[:,i]]
To illustrate and test your problem, define simple, real arrays:
In [44]: arr = np.arange(12).reshape(3,4)
In [45]: idx = np.array([[0,2,1,0],[2,2,1,0]])
In [46]: arr.shape
Out[46]: (3, 4)
In [47]: idx.shape
Out[47]: (2, 4)
Your desired calculation:
In [48]: res = np.zeros((2,4), int)
In [49]: for i in range(4):
...: res[:,i] = arr[:,i][idx[:,i]] # same as arr[idx[:,i], i]
...:
In [50]: res
Out[50]:
array([[0, 9, 6, 3],
[8, 9, 6, 3]])
Doing the same with one indexing step:
In [51]: arr[idx, np.arange(4)]
Out[51]:
array([[0, 9, 6, 3],
[8, 9, 6, 3]])
This is broadcasting the two indexing arrays against each other, and then picking points:
In [52]: np.broadcast_arrays(idx, np.arange(4))
Out[52]:
[array([[0, 2, 1, 0],
[2, 2, 1, 0]]),
array([[0, 1, 2, 3],
[0, 1, 2, 3]])]
So we are indexing the (m,c) array with 2 (n,c) arrays
The following are the same:
arr[idx]
arr[idx, :]
It is using idx to select whole rows from arr, so the result is shape of idx plus the last dimension of arr. Where as what you want is just the ith element of the idx[j,i] row.
In python have three one dimensional arrays of different shapes (like the ones given below)
a0 = np.array([5,6,7,8,9])
a1 = np.array([1,2,3,4])
a2 = np.array([11,12])
I am assuming that the array a0 corresponds to an index i=0, a1 corresponds to index i=1 and a2 corresponds to i=2. With these assumptions I want to construct a new two dimensional array where the rows would correspond to indices of the arrays (i=0,1,2) and the columns would be entries of the arrays a0, a1, a2.
In the example that I have given here, I will like the two dimensional array to look like
result = np.array([ [0,5], [0,6], [0,7], [0,8], [0,9], [1,1], [1,2],\
[1,3], [1,4], [2,11], [2,12] ])
I will very appreciate to have an answer as to how I can achieve this. In the actual problem that I am working with, I am dealing more than three one dimensional arrays. So, it will be very nice if the answer gives consideration to this.
One way to do this would be a simple list comprehension:
result = np.array([[i, arr_v] for i, arr in enumerate([a0, a1, a2])
for arr_v in arr])
>>> result
array([[ 0, 5],
[ 0, 6],
[ 0, 7],
[ 0, 8],
[ 0, 9],
[ 1, 1],
[ 1, 2],
[ 1, 3],
[ 1, 4],
[ 2, 11],
[ 2, 12]])
Adressing your concern about scaling this to more arrays, you can easily add as many arrays as you wish by simply creating a list of your array names, and using that list as the argument to enumerate:
.... for i, arr in enumerate(my_list_of_arrays) ...
You can use numpy stack functions to speed up:
aa = [a0, a1, a2]
np.hstack(tuple(np.vstack((np.full(ai.shape, i), ai)) for i, ai in enumerate(aa))).T
Here's an almost vectorized approach -
L = [a0,a1,a2] # list of all arrays
lens = [len(i) for i in L] # only looping part*
out = np.dstack(( np.repeat(np.arange(len(L)), lens), np.concatenate(L)))
*The looping part is simply to get the lengths of the arrays, which should have negligible impact on the total runtime.
Sample run -
In [19]: L = [a0,a1,a2] # list of all arrays
In [20]: lens = [len(i) for i in L]
In [21]: np.dstack(( np.repeat(np.arange(len(L)), lens), np.concatenate(L)))
Out[21]:
array([[[ 0, 5],
[ 0, 6],
[ 0, 7],
[ 0, 8],
[ 0, 9],
[ 1, 1],
[ 1, 2],
[ 1, 3],
[ 1, 4],
[ 2, 11],
[ 2, 12]]])
Another way could be to avoid np.repeat and use some array-initialization + cumsum method, which would be better for large number of arrays, as shown below -
col1 = np.concatenate(L)
col0 = np.zeros(len(col1), dtype=col1.dtype)
col0[np.cumsum(lens[:-1])] = 1
out = np.dstack((col0.cumsum(), col1))
Or use np.maximum.accumulate to replace the second cumsum -
col0[np.cumsum(lens[:-1])] = np.arange(1,len(L))
out = np.dstack((np.maximum.accumulate(col0), col1))