Slicing arrays with lists - python

So, I create a numpy array:
a = np.arange(25).reshape(5,5)
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
A conventional slice a[1:3,1:3] returns
array([[ 6, 7],
[11, 12]])
as does using a list in the second a[1:3,[1,2]]
array([[ 6, 7],
[11, 12]])
However, a[[1,2],[1,2]] returns
array([ 6, 12])
Obviously I am not understanding something here. That said, slicing with a list might on occasion be very useful.
Cheers,
keng

You observed effect of so-called Advanced Indexing. Let consider example from link:
import numpy as np
x = np.array([[1, 2], [3, 4], [5, 6]])
print(x)
[[1 2]
[3 4]
[5 6]]
print(x[[0, 1, 2], [0, 1, 0]]) # [1 4 5]
You might think about this as providing lists of (Cartesian) coordinates of grid, as
print(x[0,1]) # 1
print(x[1,1]) # 4
print(x[2,0]) # 5

In the last case, the two individual lists are treated as separate indexing operations (this is really awkward wording so please bear with me).
Numpy sees two lists of two integers and decides that you are therefore asking for two values. The row index of each value comes from the first list, while the column index of each value comes from the second list. Therefore, you get a[1,1] and a[2,2]. The : notation not only expands to the list you've accurately deduced, but also tells numpy that you want all the rows/columns in that range.
If you provide manually curated list indices, they have to be of the same size, because the size of each/any list is the number of elements you'll get back. For example, if you wanted the elements in columns 1 and 2 of rows 1,2,3:
>>> a[1:4,[1,2]]
array([[ 6, 7],
[11, 12],
[16, 17]])
But
>>> a[[1,2,3],[1,2]]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,)
The former tells numpy that you want a range of rows and specific columns, while the latter says "get me the elements at (1,1), (2,2), and (3, hey! what the?! where's the other index?)"

a[[1,2],[1,2]] is reading this as, I want a[1,1] and a[2,2]. There are a few ways around this and I likely don't even have the best ways but you could try
a[[1,1,2,2],[1,2,1,2]]
This will give you a flattened version of above
a[[1,2]][:,[1,2]]
This will give you the correct slice, it works be taking the rows [1,2] and then columns [1,2].

It triggers advanced indexing so first slice is the row index, second is the column index. For each row, it selects the corresponding column.
a[[1,2], [1,2]] -> [a[1, 1], a[2, 2]] -> [6, 12]

Related

Reverse Index through a numPy ndarray

I have a n x n dimensional numpy array of eigenvectors as columns, and want to return the last v of them as another array. However, they are currently in ascending order, and I wish to return them in descending order.
Currently, I'm attempting to index as follows
eigenvector_array[:,-1:-v]
But this doesn't seem to be working. Is there a more efficient way to do this?
Given a 2d array:
In [44]: x = np.arange(15).reshape(3,5);x
Out[44]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
the last 3 columns:
In [45]: x[:,-3:]
Out[45]:
array([[ 2, 3, 4],
[ 7, 8, 9],
[12, 13, 14]])
and reversing their order:
In [46]: x[:,-3:][:,::-1]
Out[46]:
array([[ 4, 3, 2],
[ 9, 8, 7],
[14, 13, 12]])
Or we could reverse the order, and take the first n. x[:,::-1][:,:3]
I tried combining the selection and order, but getting the end points is trickier. Separating the operations is easier.
x[:,:-4:-1]
Lets re-write this to make it a little less confusing.
eigenvector_array[:,-1:-v]
to:
eigenvector_array[:][-1:-v]
Now remember how slicing works in python:
[start:stop:step]
If you set step. to -1 it will return them in reverse, so:
eigenvector_array[:,-1:-v:-1] should be your answer.

if i have ndarray as say n=np.arange(16).reshape(4,4), which is the second to last axis?

I was going through one of the documentation of NumPy module, I come across something like : If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b, I'm beginner to NumPy I thought there are only 2 axes 0 ( rows) and 1( columns) could someone please explain what it means? if I have ND array as say n=np.arange(16).reshape(4,4), which is the second to last axis?
when you first think of it as a simple data structure, you can think of 2-dimensional arrays as rows and columns. But here, instead of saying 0:represents row and 1:column, it is more correct to say 0:represents data and 1:represents dimensions.
In other words, you need to look at the dimension-based, not the axis-based.
np.arange(16).reshape(4,4)
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
Here, we get an array with n*m(4*4) ie 4 dimensions, and 16 data in it.
Below, we obtain a 2-dimensional array containing 16 data.
np.arange(16).reshape(8,2)
array([[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[10, 11],
[12, 13],
[14, 15]])
As for the question you want to learn.
a=np.arange(16).reshape(4,4)
print(a[:,-2])
array([ 2, 6, 10, 14])
The above expression returns data in the second-to-last dimension.

Iterating Over Rows in Python Array to Extract Column Data

I am using Python and looking to iterate through each row of an Nx9 array and extract certain values from the row to form another matrix with them. The N value can change depending on the file I am reading but I have used N=3 in my example. I only require the 0th, 1st, 3rd and 4th values of each row to form into an array which I need to store. E.g:
result = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9],
[11, 12, 13, 14, 15, 16, 17, 18, 19],
[21, 22, 23, 24, 25, 26, 27, 28, 29]])
#Output matrix of first row should be: ([[1,2],[4,5]])
#Output matrix of second row should be: ([[11,12],[14,15]])
#Output matrix of third row should be: ([[21,22],[24,25]])
I should then end up with N number of matrices formed with the extracted values - a 2D matrix for each row. However, the matrices formed appear 3D so when transposed and subtracted I receive the error ValueError: operands could not be broadcast together with shapes (2,2,3) (3,2,2). I am aware that a (3,2,2) matrix cannot be subtracted from a (2,2,3) so how do I obtain a 2D matrix N number of times? Would a loop be better suited? Any suggestions?
import numpy as np
result = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9],
[11, 12, 13, 14, 15, 16, 17, 18, 19],
[21, 22, 23, 24, 25, 26, 27, 28, 29]])
a = result[:, 0]
b = result[:, 1]
c = result[:, 2]
d = result[:, 3]
e = result[:, 4]
f = result[:, 5]
g = result[:, 6]
h = result[:, 7]
i = result[:, 8]
output = [[a, b], [d, e]]
output = np.array(output)
output_transpose = output.transpose()
result = 0.5 * (output - output_transpose)
In [276]: result = np.array(
...: [
...: [1, 2, 3, 4, 5, 6, 7, 8, 9],
...: [11, 12, 13, 14, 15, 16, 17, 18, 19],
...: [21, 22, 23, 24, 25, 26, 27, 28, 29],
...: ]
...: )
...:
...: a = result[:, 0]
...
...: i = result[:, 8]
...: output = [[a, b], [d, e]]
In [277]: output
Out[277]:
[[array([ 1, 11, 21]), array([ 2, 12, 22])],
[array([ 4, 14, 24]), array([ 5, 15, 25])]]
In [278]: arr = np.array(output)
In [279]: arr
Out[279]:
array([[[ 1, 11, 21],
[ 2, 12, 22]],
[[ 4, 14, 24],
[ 5, 15, 25]]])
In [280]: arr.shape
Out[280]: (2, 2, 3)
In [281]: arr.T.shape
Out[281]: (3, 2, 2)
transpose exchanges the 1st and last dimensions.
A cleaner way to make a (N,2,2) array from selected columns is:
In [282]: arr = result[:,[0,1,3,4]].reshape(3,2,2)
In [283]: arr.shape
Out[283]: (3, 2, 2)
In [284]: arr
Out[284]:
array([[[ 1, 2],
[ 4, 5]],
[[11, 12],
[14, 15]],
[[21, 22],
[24, 25]]])
Since the last 2 dimensions are 2, you could transpose them, and take the difference:
In [285]: arr-arr.transpose(0,2,1)
Out[285]:
array([[[ 0, -2],
[ 2, 0]],
[[ 0, -2],
[ 2, 0]],
[[ 0, -2],
[ 2, 0]]])
Another way to get the (N,2,2) array is with a matrix index:
In [286]: result[:,[[0,1],[3,4]]]
Out[286]:
array([[[ 1, 2],
[ 4, 5]],
[[11, 12],
[14, 15]],
[[21, 22],
[24, 25]]])
Ok, this is not a coding problem, but a math problem. I wrote some code for you, since it's pretty obvious you're a beginner, so there will be some unfamiliar syntax in there that you should look into so you can avoid problems like this in the future. You might not use them all that often, but it's good to know how to do it, because it expands your understanding of python syntax in general.
First up, the complete code for easy copy and pasting:
import numpy as np
result=np.array([[1,2,3,4,5,6,7,8,9],
[11,12,13,14,15,16,17,18,19],
[21,22,23,24,25,26,27,28,29]])
output = np.array(tuple(result[:,i] for i in (0,1,3)))
def Matrix_Operation(Matrix,Coefficient):
if (Matrix.shape == Matrix.shape[::-1]
and isinstance(Matrix,np.ndarray)
and isinstance(Coefficient,float)):
return Coefficient*(Matrix-Matrix.transpose())
else:
print('The shape of you Matrix is not palindromic')
print('You cannot substitute matrices of unequal shape')
print('Your shape: %s'%str(Matrix.shape))
print(Matrix_Operation(output,0.5))
Now let's talk about a step by step explanation of what's happening here:
import numpy as np
result=np.array([[1,2,3,4,5,6,7,8,9],
[11,12,13,14,15,16,17,18,19],
[21,22,23,24,25,26,27,28,29]])
Python uses indentation (alignment of whitespaces) as an integral part of it's syntax. However, if you provide brackets, a lot of the time you don't need aligning indentations in order for the interpreter to understand your code. If you provide a large array of values manually, it is usually adviseable to start new lines at the commas (here, the commas separating the sublists). It's just more readable and that way your data isn't off screen in your coding program.
output = np.array(tuple(result[:,i] for i in (0,1,3)))
List comprehensions are a big deal in python and really handy for dirty one liners. As far as I know, no other language gives you this option. That's one of the reasons why python is so great. I basically created a list of lists, where each sublist is result[:,i] for every i in (0,1,3). This is cast as a tuple (yes, list comprehensions can also be done with tuples, not just lists). Finally I wrapped it in the np.array function, since this is the type required for our mathematical operations later on.
def Matrix_Operation(Matrix,Coefficient):
if (Matrix.shape == Matrix.shape[::-1]
and isinstance(Matrix,np.ndarray)
and isinstance(Coefficient,(float,int))):
return Coefficient*(Matrix-Matrix.transpose())
else:
print('The shape of you Matrix is not palindromic')
print('You cannot substitute matrices of unequal shape')
print('Your shape: %s'%str(Matrix.shape))
print(Matrix_Operation(output,0.5))
If you're gonna create a complex formula in python code, why not wrap it inside an abstractable function? You can incorporate a lot of "quality control" into a function as well, to check if it is given the correct input for the task it is supposed to do.
Your code failed, because you were trying to subtract a (2,2,3) shaped matrix from a (3,2,2) matrix. So we'll need a code snippet to check, if our provided matrix has a palindromic shape. You can reverse the order of items in a container by doing Container[::-1] and so we ask, if Matrix.shape == Matrix.shape[::-1]. Further, it is necessary, that our Matrix is a np.ndarray and if our coefficient is a number. That's what I'm doing with the isinstance() function. You can check for multiple types at once, which is why the isinstance(Coefficient,(float,int)) contains a tuple with both int and float in it.
Now that we have ensured that our input makes sense, we can preform our Matrix_Operation.
So in conclusion: Check if your math is solid before asking SO for help, because people here can get pretty annoyed at that sort of thing. You probably noticed by now that someone has already downvoted your question. Personally, I believe it's necessary to let newbies ask a couple stupid questions before they get into the groove, but that's what the voting button is for, I guess.

Python Numpy syntax: what does array index as two arrays separated by comma mean?

I don't understand array as index in Python Numpy.
For example, I have a 2d array A in Numpy
[[1,2,3]
[4,5,6]
[7,8,9]
[10,11,12]]
What does A[[1,3], [0,1]] mean?
Just test it for yourself!
A = np.arange(12).reshape(4,3)
print(A)
>>> array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
By slicing the array the way you did (docs to slicing), you'll get the first row, zero-th column element and the third row, first column element.
A[[1,3], [0,1]]
>>> array([ 3, 10])
I'd highly encourage you to play around with that a bit and have a look at the documentation and the examples.
Your are creating a new array:
import numpy as np
A = [[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]]
A = np.array(A)
print(A[[1, 3], [0, 1]])
# [ 4 11]
See Indexing, Slicing and Iterating in the tutorial.
Multidimensional arrays can have one index per axis. These indices are given in a tuple separated by commas
Quoting the doc:
def f(x,y):
return 10*x+y
b = np.fromfunction(f, (5, 4), dtype=int)
print(b[2, 3])
# -> 23
You can also use a NumPy array as index of an array. See Index arrays in the doc.
NumPy arrays may be indexed with other arrays (or any other sequence- like object that can be converted to an array, such as lists, with the exception of tuples; see the end of this document for why this is). The use of index arrays ranges from simple, straightforward cases to complex, hard-to-understand cases. For all cases of index arrays, what is returned is a copy of the original data, not a view as one gets for slices.

How do I extract a set of given columns from a numPy array?

Given a numpy array such as:
x = array([[0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
How do I form a new array composed of the first and third columns?
To extract the first and third columns from the array use the following syntax:
x[:,[0,2]]
This means, take all rows, selecting only columns 0 and 2. Note that indexing in numPy arrays starts at zero.

Categories

Resources