I know np.maximum computes the element-wise maximum, e.g.
>>> b = np.array([3, 6, 1])
>>> c = np.array([4, 2, 9])
>>> np.maximum(b, c)
array([4, 6, 9])
But is there any way to get the index as well? like in the above example, I also want something like this where each tuple denote (which array, index), it could be tuple or dictionary or something else. And also it would be great if it could work on 3d array, like the input two arrays are 3d arrays.
array([(1, 0), (0, 1), (1, 2)])
You could stack the two 1d-arrays to get a 2d-array and use argmax:
arr = np.vstack((b, c))
indices = np.argmax(arr, axis=0)
This will give you a list of integers, not tuples, but as you know that you compare per column, the last elements of each tuple are unnecessary anyway. They are just ascending integers starting at 0. If you really need them, though, you could just add
indices = list(zip(indices, range(len(b)))
Related
I try to index an array (has five dimensions) using a list. However, under certain situation, the array is permuted.
Say, a has the shape of (3,4,5,6,7), i.e.,
>>> a = np.zeros((3,4,5,6,7))
>>> a.shape
(3, 4, 5, 6, 7)
Using a list to index this array on the third dimension, it looks normal:
>>> a[:,:,[0,3],:,:].shape
(3, 4, 2, 6, 7)
However, if the array were indexed under the following situation, the third dimension is permuted to the leftmost:
>>> a[0,:,[0,1],:,:].shape
(2, 4, 6, 7)
Can anyone shed some light on it?
Basic Slicing:-
Basic Slicing occurs when a slice object is used.Usually a slice object is constructed as array[(start:stop:step)]. Ellipsis and newaxis also comes under this.
Example:- 1D array
>>x=np.arange(10)
>>x[2:10:3]
array([2, 5, 8])
Example:- 2D array
>>>x = np.array([[1,2,3], [4,5,6]])
>>>x[1:2]
array([[4, 5, 6]])
Example:- 3D array
>>>x = np.array([[[1],[2],[3]], [[4],[5],[6]]])
>>> x[0:1]
array([[[1],
[2],
[3]]])
In the above example the number of slices(obj) given is less than that of the total number of dimension of the array. If the number of objects in the selection tuple is less than N, then it is assumed for any subsequent dimensions.
Advanced Slicing:-
Advanced indexing is triggered when the selection object, obj,
is a non-tuple sequence object,
an ndarray (of data type integer or bool),
a tuple with at least one sequence object or ndarray (of data type integer or bool).
There are two types of advanced indexing: Integer and Boolean.
Integer Indexing:-
Integer array indexing allows selection of arbitrary items in the array based on their N-dimensional index. Each integer array represents a number of indexes into that dimension.
When the index consists of as many integer arrays as the array being indexed has dimensions, the indexing is straight forward, but different from slicing.
Example:-
>>a = np.array([[1,2,3],[4,5,6],[7,8,9]])
>>a[[0,1,2],[0,1,1]]
array([1, 5, 8])
The above example prints:
a[0,0],a[1,0],a[2,1]
Remember:- So Integer Indexing maps between two indexes.
Now to your question:-
>>>a=np.array([3,4,5])
>>>a[0,:,[0,1]]
First Case:-
This is of the form x[arr1,:,arr2].
arr1 and arr2 are advanced indexes.We consider 0 also to be an advanced index.
If the advanced indexes are separated by a slice, Ellipsis or newaxis then the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that.
This essentially means that the dimension of [0,1] comes first in the array. I am leaving off 0 as it has no dimension.
>>>a[0,:,[0,1]].shape
(2,4)
Second case:-
This is of the form x[:,:,arr1]. Here only arr1 is advanced index.
If the advanced indexes are all next to each other then the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array.
This essentially means that the dimension of [0,1] comes at its respective position specified in the index of the array.
>>>a[0:1,:,[0,1]].shape
(1,4,2)
[0,1] has shape(2,) and since it occurs at third index it is inserted into 3rd index of the result array.
Any suggestions and improvements are Welcome.
Reference:-
Numpy_Docs
Thanks #Hari_Sheldon for the reply. Now, I've seen what print has done to the array a, but I still do not understand why Python takes those columns specified by a list and puts them as rows at the leftmost position. Is there any reference out there to explain the reason?
And, under some situations, this dimension permutation does not occur, i.e.:
>>> a[0:1,:,[0,3]].shape
(1, 4, 2)
As you can see, instead of permuting it into (2, 4), the dimensional order remains!
I have a multidimensional array a:
a = np.random.uniform(1,10,(2,4,2,3,10,10))
For dimensions 4-6, I have 3 lists which contain the indexes for slicing that dimension of array 'a'
dim4 = [0,2]
dim5 = [3,5,9]
dim6 = [1,2,7,8]
How do I slice out array 'a' such that i get:
b = a[0,:,0,dim4,dim5,dim6]
So b should be an array with shape (4,2,3,4), and containing elements from the corresponding dimensions of a. When I try the code above, I get an error saying that different shapes can't be broadcast together for axis 4-6, but if I were to do:
b = a[0,:,0:2,0:3,0:4]
then it does work, even though the slicing lists all have different lengths. So how do you slice multidimensional arrays with non adjacent indexes?
You can use the numpy.ix_ function to construct complex indexing like this. It takes a sequence of array_like, and makes an "open mesh" from them. The example from the docstring is pretty clear:
Using ix_ one can quickly construct index arrays that will index
the cross product. a[np.ix_([1,3],[2,5])] returns the array
[[a[1,2] a[1,5]], [a[3,2] a[3,5]]].
So, for your data, you'd do:
>>> indices = np.ix_((0,), np.arange(a.shape[1]), (0,), dim4, dim5, dim6)
>>> a[indices].shape
(1, 4, 1, 2, 3, 4)
Get rid of the size-1 dimensions with np.squeeze:
>>> np.squeeze(a[indices]).shape
(4, 2, 3, 4)
I have an array:
a = [1, 3, 5, 7, 29 ... 5030, 6000]
This array gets created from a previous process, and the length of the array could be different (it is depending on user input).
I also have an array:
b = [3, 15, 67, 78, 138]
(Which could also be completely different)
I want to use the array b to slice the array a into multiple arrays.
More specifically, I want the result arrays to be:
array1 = a[:3]
array2 = a[3:15]
...
arrayn = a[138:]
Where n = len(b).
My first thought was to create a 2D array slices with dimension (len(b), something). However we don't know this something beforehand so I assigned it the value len(a) as that is the maximum amount of numbers that it could contain.
I have this code:
slices = np.zeros((len(b), len(a)))
for i in range(1, len(b)):
slices[i] = a[b[i-1]:b[i]]
But I get this error:
ValueError: could not broadcast input array from shape (518) into shape (2253412)
You can use numpy.split:
np.split(a, b)
Example:
np.split(np.arange(10), [3,5])
# [array([0, 1, 2]), array([3, 4]), array([5, 6, 7, 8, 9])]
b.insert(0,0)
result = []
for i in range(1,len(b)):
sub_list = a[b[i-1]:b[i]]
result.append(sub_list)
result.append(a[b[-1]:])
You are getting the error because you are attempting to create a ragged array. This is not allowed in numpy.
An improvement on #Bohdan's answer:
from itertools import zip_longest
result = [a[start:end] for start, end in zip_longest(np.r_[0, b], b)]
The trick here is that zip_longest makes the final slice go from b[-1] to None, which is equivalent to a[b[-1]:], removing the need for special processing of the last element.
Please do not select this. This is just a thing I added for fun. The "correct" answer is #Psidom's answer.
Why can't I index an ndarray using a list of tuple indices like so?
idx = [(x1, y1), ... (xn, yn)]
X[idx]
Instead I have to do something unwieldy like
idx2 = numpy.array(idx)
X[idx2[:, 0], idx2[:, 1]] # or more generally:
X[tuple(numpy.vsplit(idx2.T, 1)[0])]
Is there a simpler, more pythonic way?
You can use a list of tuples, but the convention is different from what you want. numpy expects a list of row indices, followed by a list of column values. You, apparently, want to specify a list of (x,y) pairs.
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing
The relevant section in the documentation is 'integer array indexing'.
Here's an example, seeking 3 points in a 2d array. (2 points in 2d can be confusing):
In [223]: idx
Out[223]: [(0, 1, 1), (2, 3, 0)]
In [224]: X[idx]
Out[224]: array([2, 7, 4])
Using your style of xy pairs of indices:
In [230]: idx1 = [(0,2),(1,3),(1,0)]
In [231]: [X[i] for i in idx1]
Out[231]: [2, 7, 4]
In [240]: X[tuple(np.array(idx1).T)]
Out[240]: array([2, 7, 4])
X[tuple(zip(*idx1))] is another way of doing the conversion. The tuple() is optional in Python2. zip(*...) is a Python idiom that reverses the nesting of a list of lists.
You are on the right track with:
In [242]: idx2=np.array(idx1)
In [243]: X[idx2[:,0], idx2[:,1]]
Out[243]: array([2, 7, 4])
My tuple() is just a bit more compact (and not necessarily more 'pythonic'). Given the numpy convention, some sort of conversion is necessary.
(Should we check what works with n-dimensions and m-points?)
Use a tuple of NumPy arrays which can be directly passed to index your array:
index = tuple(np.array(list(zip(*index_tuple))))
new_array = list(prev_array[index])
Say I have a 3 dimensional numpy array:
np.random.seed(1145)
A = np.random.random((5,5,5))
and I have two lists of indices corresponding to the 2nd and 3rd dimensions:
second = [1,2]
third = [3,4]
and I want to select the elements in the numpy array corresponding to
A[:][second][third]
so the shape of the sliced array would be (5,2,2) and
A[:][second][third].flatten()
would be equivalent to to:
In [226]:
for i in range(5):
for j in second:
for k in third:
print A[i][j][k]
0.556091074129
0.622016249651
0.622530505868
0.914954716368
0.729005532319
0.253214472335
0.892869371179
0.98279375528
0.814240066639
0.986060321906
0.829987410941
0.776715489939
0.404772469431
0.204696635072
0.190891168574
0.869554447412
0.364076117846
0.04760811817
0.440210532601
0.981601369658
Is there a way to slice a numpy array in this way? So far when I try A[:][second][third] I get IndexError: index 3 is out of bounds for axis 0 with size 2 because the [:] for the first dimension seems to be ignored.
Numpy uses multiple indexing, so instead of A[1][2][3], you can--and should--use A[1,2,3].
You might then think you could do A[:, second, third], but the numpy indices are broadcast, and broadcasting second and third (two one-dimensional sequences) ends up being the numpy equivalent of zip, so the result has shape (5, 2).
What you really want is to index with, in effect, the outer product of second and third. You can do this with broadcasting by making one of them, say second into a two-dimensional array with shape (2,1). Then the shape that results from broadcasting second and third together is (2,2).
For example:
In [8]: import numpy as np
In [9]: a = np.arange(125).reshape(5,5,5)
In [10]: second = [1,2]
In [11]: third = [3,4]
In [12]: s = a[:, np.array(second).reshape(-1,1), third]
In [13]: s.shape
Out[13]: (5, 2, 2)
Note that, in this specific example, the values in second and third are sequential. If that is typical, you can simply use slices:
In [14]: s2 = a[:, 1:3, 3:5]
In [15]: s2.shape
Out[15]: (5, 2, 2)
In [16]: np.all(s == s2)
Out[16]: True
There are a couple very important difference in those two methods.
The first method would also work with indices that are not equivalent to slices. For example, it would work if second = [0, 2, 3]. (Sometimes you'll see this style of indexing referred to as "fancy indexing".)
In the first method (using broadcasting and "fancy indexing"), the data is a copy of the original array. In the second method (using only slices), the array s2 is a view into the same block of memory used by a. An in-place change in one will change them both.
One way would be to use np.ix_:
>>> out = A[np.ix_(range(A.shape[0]),second, third)]
>>> out.shape
(5, 2, 2)
>>> manual = [A[i,j,k] for i in range(5) for j in second for k in third]
>>> (out.ravel() == manual).all()
True
Downside is that you have to specify the missing coordinate ranges explicitly, but you could wrap that into a function.
I think there are three problems with your approach:
Both second and third should be slices
Since the 'to' index is exclusive, they should go from 1 to 3 and from 3 to 5
Instead of A[:][second][third], you should use A[:,second,third]
Try this:
>>> np.random.seed(1145)
>>> A = np.random.random((5,5,5))
>>> second = slice(1,3)
>>> third = slice(3,5)
>>> A[:,second,third].shape
(5, 2, 2)
>>> A[:,second,third].flatten()
array([ 0.43285482, 0.80820122, 0.64878266, 0.62689481, 0.01298507,
0.42112921, 0.23104051, 0.34601169, 0.24838564, 0.66162209,
0.96115751, 0.07338851, 0.33109539, 0.55168356, 0.33925748,
0.2353348 , 0.91254398, 0.44692211, 0.60975602, 0.64610556])