Indexing a numpy array with a list of tuples - python

Why can't I index an ndarray using a list of tuple indices like so?
idx = [(x1, y1), ... (xn, yn)]
X[idx]
Instead I have to do something unwieldy like
idx2 = numpy.array(idx)
X[idx2[:, 0], idx2[:, 1]] # or more generally:
X[tuple(numpy.vsplit(idx2.T, 1)[0])]
Is there a simpler, more pythonic way?

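You can index with a sequence of tuples, but the convention is different from what you want. NumPy expects a sequence of row indices, followed by a sequence of column indices. You, apparently, want to specify a list of (x,y) pairs. (Since NumPy 1.23 the outer container has to be a tuple; a list of tuples is treated as a single array index instead.)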
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing
The relevant section in the documentation is 'integer array indexing'.
Here's an example, seeking 3 points in a 2d array. (2 points in 2d can be confusing):
In [223]: idx
Out[223]: [(0, 1, 1), (2, 3, 0)]
In [224]: X[idx]
Out[224]: array([2, 7, 4])
Using your style of xy pairs of indices:
In [230]: idx1 = [(0,2),(1,3),(1,0)]
In [231]: [X[i] for i in idx1]
Out[231]: [2, 7, 4]
In [240]: X[tuple(np.array(idx1).T)]
Out[240]: array([2, 7, 4])
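X[tuple(zip(*idx1))] is another way of doing the conversion. The tuple() was optional in Python 2, where zip returned a list and the NumPy versions of the time accepted a list of sequences as a multidimensional index. In Python 3, zip returns an iterator, so the tuple() is required. zip(*...) is a Python idiom that transposes a list of lists.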
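The session above never shows how X was built. A minimal sketch that reproduces the same numbers, assuming X = np.arange(12).reshape(3, 4) (an assumption; any 2d array holding those values at those positions would behave the same):

```python
import numpy as np

# Assumed array: the original answer does not show X; this one matches its outputs.
X = np.arange(12).reshape(3, 4)

# (row, col) pairs, in the style the question asks for.
idx1 = [(0, 2), (1, 3), (1, 0)]

# zip(*idx1) regroups the pairs into (all rows, all cols).
rows_cols = tuple(zip(*idx1))   # ((0, 1, 1), (2, 3, 0))

picked = X[rows_cols]
print(picked)                   # [2 7 4]
```

The outer tuple() is what tells NumPy to treat each inner sequence as the index for one axis.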
You are on the right track with:
In [242]: idx2=np.array(idx1)
In [243]: X[idx2[:,0], idx2[:,1]]
Out[243]: array([2, 7, 4])
My tuple() is just a bit more compact (and not necessarily more 'pythonic'). Given the numpy convention, some sort of conversion is necessary.
(Should we check what works with n-dimensions and m-points?)
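On the n-dimensions, m-points question: a quick check suggests the same transpose-to-tuple conversion carries over. The sketch below uses a made-up 3d array and four made-up points (neither comes from the original answer), and compares the result against indexing one point at a time.

```python
import numpy as np

# Hypothetical 3d array and 4 points, chosen only for illustration.
X = np.arange(2 * 3 * 4).reshape(2, 3, 4)
points = [(0, 1, 2), (1, 2, 3), (1, 0, 0), (0, 0, 1)]

# Transpose the (m, n) list of points into n index arrays of length m.
index = tuple(np.asarray(points).T)
values = X[index]

# Reference result: index with each point tuple separately.
expected = np.array([X[p] for p in points])

print(values)                            # [ 6 23 12  1]
print(np.array_equal(values, expected))  # True
```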

Use a tuple of NumPy arrays which can be directly passed to index your array:
index = tuple(np.array(list(zip(*index_tuple))))
new_array = list(prev_array[index])

Related

How to get the index of np.maximum?

I know np.maximum computes the element-wise maximum, e.g.
>>> b = np.array([3, 6, 1])
>>> c = np.array([4, 2, 9])
>>> np.maximum(b, c)
array([4, 6, 9])
But is there any way to get the index as well? For the example above, I would also like something like this, where each tuple denotes (which array, index); it could be a tuple, a dictionary, or something else. It would also be great if it worked on 3d arrays, i.e. when the two input arrays are 3d.
array([(1, 0), (0, 1), (1, 2)])
You could stack the two 1d-arrays to get a 2d-array and use argmax:
arr = np.vstack((b, c))
indices = np.argmax(arr, axis=0)
This will give you an array of integers, not tuples, but since you compare column by column, the second element of each tuple is unnecessary anyway: it is just the position, an ascending integer starting at 0. If you really need the tuples, though, you could just add
indices = list(zip(indices, range(len(b))))
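Putting the pieces together, and extending them to the 3d case the question asks about. This is a sketch, not a built-in feature of np.maximum: it uses np.stack instead of np.vstack so that the same line works for any number of dimensions, and the 3d inputs b3 and c3 are made up for illustration.

```python
import numpy as np

b = np.array([3, 6, 1])
c = np.array([4, 2, 9])

# Stack along a new leading axis: 0 means "from b", 1 means "from c".
which = np.argmax(np.stack((b, c)), axis=0)
pairs = list(zip(which.tolist(), range(len(b))))
print(pairs)   # [(1, 0), (0, 1), (1, 2)]

# Hypothetical 3d inputs; the same stacking works unchanged.
b3 = np.arange(24).reshape(2, 3, 4)
c3 = 23 - b3
which3 = np.argmax(np.stack((b3, c3)), axis=0)   # same shape as b3

# Rebuild the maximum from the "which array" mask as a sanity check.
rebuilt = np.where(which3 == 0, b3, c3)
print(np.array_equal(rebuilt, np.maximum(b3, c3)))   # True
```

For ties, np.argmax reports the first occurrence, so equal elements are attributed to the first array.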

How to one liner access numpy array in a list?

Given arrays nested in a list of lists:
import numpy as np
n_pair = 5
np.random.seed ( 0 )
nsteps = 4
nmethod = 2
nbands = 3
t_band=0
t_method=0
t_step=0
t_sbj=0
t_gtmethod=1
all_sub = [[np.random.rand ( nmethod, nbands, 2 ) for _ in range ( nsteps )] for _ in range ( 3)]
I then extract one data point from each sublist as below:
this_gtmethod=[x[t_step][t_method][t_band][t_gtmethod] for x in all_sub]
However, I would like to avoid the loop and instead access all three elements directly, as below:
this_gtmethod=all_sub[:][t_step][t_method][t_band][t_gtmethod]
But this does not return the expected result.
Where did I go wrong?
This sort of slicing and indexing is best accomplished with Numpy arrays rather than lists.
If you make all_sub into a Numpy array, you can achieve your desired result with simple slicing.
all_sub = np.array(all_sub)
this_gtmethod = all_sub[:, t_step, t_method, t_band, t_gtmethod]
The result is the same as with your looping example.
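To see why the chained version fails: all_sub[:] is just a shallow copy of the outer list, so every following [...] moves one level deeper instead of staying at the intended axis. A small sketch using the question's own setup (the names looped, chained, M and sliced are introduced here for illustration):

```python
import numpy as np

np.random.seed(0)
nsteps, nmethod, nbands = 4, 2, 3
all_sub = [[np.random.rand(nmethod, nbands, 2) for _ in range(nsteps)]
           for _ in range(3)]
t_step = t_method = t_band = 0
t_gtmethod = 1

# Loop version from the question: one scalar per outer list entry.
looped = [x[t_step][t_method][t_band][t_gtmethod] for x in all_sub]

# Chained version: [:] copies the list, then [t_step] picks the first
# outer entry, [t_method] picks a step, [t_band] a method, [t_gtmethod] a band.
chained = all_sub[:][t_step][t_method][t_band][t_gtmethod]
print(chained.shape)               # (2,)

# Array version: one index per axis, with ':' keeping the outer axis.
M = np.array(all_sub)
sliced = M[:, t_step, t_method, t_band, t_gtmethod]
print(sliced.shape)                # (3,)
print(np.allclose(sliced, looped)) # True
```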
You made a list of lists of arrays:
In [279]: type(all_sub), len(all_sub)
Out[279]: (list, 3)
In [280]: type(all_sub[0]), len(all_sub[0])
Out[280]: (list, 4)
In [282]: type(all_sub[0][0]), all_sub[0][0].shape
Out[282]: (numpy.ndarray, (2, 3, 2))
Lists can only be indexed with a scalar value or slice. List comprehension is the normal way of iterating through a list.
But an array can be indexed several dimensions at a time:
In [283]: all_sub[0][1][1,2,:]
Out[283]: array([0.46147936, 0.78052918])
Since the nested lists are all the same size, and the arrays all have the same shape, it can be turned into a multidimensional array:
In [284]: M = np.array(all_sub)
In [285]: M.shape
Out[285]: (3, 4, 2, 3, 2)
2 ways of accessing the same subarrays:
In [286]: M[:,0,0,0,:]
Out[286]:
array([[0.5488135 , 0.71518937],
       [0.31542835, 0.36371077],
       [0.58651293, 0.02010755]])
In [287]: [a[0][0,0,:] for a in all_sub]
Out[287]:
[array([0.5488135 , 0.71518937]),
 array([0.31542835, 0.36371077]),
 array([0.58651293, 0.02010755])]

Calculation on list of numpy array

I'm trying to do some calculation (mean, sum, etc.) on a list containing numpy arrays.
For example:
list = [array([2, 3, 4]),array([4, 4, 4]),array([6, 5, 4])]
How can I retrieve the mean (for example)?
As a list like [4,4,4] or as a numpy array like array([4,4,4])?
Thanks in advance for your help!
EDIT : Sorry, I didn't explain properly what I was aiming to do : I would like to get the mean of i-th index of the arrays. For example, for index 0 :
(2+4+6)/3 = 4
I don't want this :
(2+3+4)/3 = 3
Therefore the end result will be
[4,4,4] / and not [3,4,5]
If L were a list of scalars, then calculating the mean could be done using the straightforward expression:
sum(L) / len(L)
Luckily, this works unchanged on lists of arrays:
L = [np.array([2, 3, 4]), np.array([4, 4, 4]), np.array([6, 5, 4])]
sum(L) / len(L)
# array([4., 4., 4.])
For this example this happens to be quite a bit faster than the numpy function np.mean:
from timeit import timeit
timeit(lambda: np.mean(L, axis=0))
# 13.708808058872819
timeit(lambda: sum(L) / len(L))
# 3.4780975924804807
You can use a for loop and iterate through the elements of your array, if your list is not too big:
mean = []
for i in range(len(list)):
    mean.append(np.mean(list[i]))
Given a 1d array a, np.mean(a) should do the trick.
If you have a 2d array and want the means for each one separately, specify np.mean(a, axis=1).
There are equivalent functions for np.sum, etc.
https://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html
https://docs.scipy.org/doc/numpy/reference/generated/numpy.sum.html
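The axis argument is what separates the two results discussed in the question. A short sketch on the question's data, stacking the list into a 2d array first (the variable names are illustrative):

```python
import numpy as np

arrays = [np.array([2, 3, 4]), np.array([4, 4, 4]), np.array([6, 5, 4])]
stacked = np.stack(arrays)           # shape (3, 3): one row per array

# axis=0 averages across the arrays, position by position (what the edit asks for).
per_position = stacked.mean(axis=0)
print(per_position)                  # [4. 4. 4.]

# axis=1 averages within each array (the result the asker does not want).
per_array = stacked.mean(axis=1)
print(per_array)                     # [3. 4. 5.]

# np.sum and the other reductions take the same axis argument.
totals = stacked.sum(axis=0)
print(totals)                        # [12 12 12]
```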
You can use np.mean with axis=0:
import numpy as np
my_list = [np.array([2, 3, 4]), np.array([4, 4, 4]), np.array([6, 5, 4])]
np.mean(my_list, axis=0)  # array([4., 4., 4.])
Note: Do not name your variable list, as that shadows the built-in list type.

Custom slicing in numpy arrays (get specific elements, then every n-th) possible?

I'm in need of a more customized way to extract given elements from a numpy array than the general indexing seems to allow me. In particular, I want to get a number of arbitrary, predefined elements, then every n-th, starting at a given point.
Say, e.g., I want the second (as in index number 2) and fourth element of an array, and then, every third element, beginning from the sixth one. So far, I'm doing:
newArray = np.concatenate((myArray[[2, 4]], myArray[6::3]))
Is there a more convenient way to achieve this?
It's effectively identical to what you're doing, but you might find it a bit more convenient to do:
new_array = my_array[np.r_[2, 4, 6:len(my_array):3]]
np.r_ is basically concatenation + arange-like slicing.
For example:
In [1]: import numpy as np
In [2]: np.r_[np.arange(5), np.arange(1, 4)]
Out[2]: array([0, 1, 2, 3, 4, 1, 2, 3])
In [3]: np.r_[1, 2, :5]
Out[3]: array([1, 2, 0, 1, 2, 3, 4])
In [4]: np.r_[:5]
Out[4]: array([0, 1, 2, 3, 4])
The downside to this approach is that you're building up a (potentially very large) additional indexing array. In either case, you're going to wind up creating a copy, but if my_array is very large, your original approach is more efficient.
np.r_ is a bit unreadable (meant for interactive use), but it can be a very handy way of building up arbitrary indexing arrays.
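A small check that the two spellings select the same elements, using a made-up 20-element array (my_array here is illustrative, not from the question):

```python
import numpy as np

# Hypothetical data: values are 10x their index, so picks are easy to read.
my_array = np.arange(20) * 10

# Concatenate version: fancy-index the fixed picks, slice the rest.
via_concat = np.concatenate((my_array[[2, 4]], my_array[6::3]))

# np.r_ version: build a single index array, then index once.
index = np.r_[2, 4, 6:len(my_array):3]
via_r = my_array[index]

print(index)   # [ 2  4  6  9 12 15 18]
print(via_r)   # [ 20  40  60  90 120 150 180]
print(np.array_equal(via_concat, via_r))   # True
```

Note that np.concatenate takes the arrays as a single sequence argument, and that the fixed picks need a list index (my_array[[2, 4]]) on a 1d array.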

Get original indices of a sorted Numpy array

I have an array of distances a = np.array([20.5 ,5.3 ,60.7 ,3.0 ], 'double') and I need the indices of the sorted array (for example [3, 1, 0, 2], for a.sort()). Is there a function in Numpy to do that?
Yes, there's the x = numpy.argsort(a) function or x = numpy.ndarray.argsort(a) method. It does exactly what you're asking for. You can also call argsort as a method on an ndarray object like so: a.argsort().
Here's a link to the documentation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html#numpy.argsort
Here's an example, for reference and convenience:
# create an array
a = np.array([5, 2, 3])
# np.sort returns a sorted copy of the array
np.sort(a)
# array([2, 3, 5])
# np.argsort returns the indices that would sort the array
np.argsort(a)
# array([1, 2, 0])
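Applied to the distances from the question, a short sketch (the names order and sorted_a are introduced here for illustration):

```python
import numpy as np

a = np.array([20.5, 5.3, 60.7, 3.0], 'double')

# Indices of the elements of a, in ascending order of value.
order = np.argsort(a)
print(order)           # [3 1 0 2]

# Indexing with those indices gives the sorted array without modifying a.
sorted_a = a[order]
print(np.array_equal(sorted_a, np.sort(a)))   # True
```

Unlike a.sort(), which sorts in place and returns None, this leaves a untouched, so the original positions remain available.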
