Indexing numpy multidimensional arrays depends on a slicing method - python

I have a 3-D array. When I take a 2-D slice of it, the result depends on whether it is indexed with a list or with a slice: in the first case the result is transposed. I didn't find this behaviour in the manual.
>>> import numpy as np
>>> x = np.array([[[1,1,1],[2,2,2]], [[3,3,3],[4,4,4]]])
>>> x
array([[[1, 1, 1],
        [2, 2, 2]],

       [[3, 3, 3],
        [4, 4, 4]]])
>>> x[0,:,[0,1]]
array([[1, 2],
       [1, 2]])
>>> x[0,:,slice(2)]
array([[1, 1],
       [2, 2]])
>>>
Could anyone point out the rationale for this?

Because you are actually using advanced indexing when you use [0,1]. From the docs:
Combining advanced and basic indexing

When there is at least one slice (:), ellipsis (...) or np.newaxis in the index (or the array has more dimensions than there are advanced indexes), then the behaviour can be more complicated. It is like concatenating the indexing result for each advanced index element.

In the simplest case, there is only a single advanced index. A single advanced index can for example replace a slice and the result array will be the same, however, it is a copy and may have a different memory layout. A slice is preferable when it is possible.
Pay attention to two parts of that quote: "it is like concatenating the indexing result for each advanced index element", and "it is a copy and may have a different memory layout".
In particular, in this construction:
>>> x[0,:,[0,1]]
array([[1, 2],
       [1, 2]])
This is the case where there is at least one "slice, ellipsis, or np.newaxis" in the index, and the behaviour is like concatenating the indexing result for each advanced index element. So:
>>> x[0,:,[0]]
array([[1, 2]])
>>> x[0,:,[1]]
array([[1, 2]])
>>> np.concatenate((x[0,:,[0]], x[0,:,[1]]))
array([[1, 2],
       [1, 2]])
However, this construction is the simple case: there is no advanced index at all, only slices, so ordinary slicing semantics apply:
>>> x[0,:,slice(2)]
array([[1, 1],
       [2, 2]])
>>> x[slice(0,1),:,slice(2)]
array([[[1, 1],
        [2, 2]]])
Note, though, that the latter result is three-dimensional, because every part of the index acted as a slice: three slices, so three dimensions.
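To see the two behaviours side by side: with this particular x, the advanced-index result is exactly the transpose of the basic-slice result (a minimal sketch of the concatenation rule described above):

```python
import numpy as np

x = np.array([[[1, 1, 1], [2, 2, 2]],
              [[3, 3, 3], [4, 4, 4]]])

# Advanced index: the result is built by concatenating x[0, :, 0] and
# x[0, :, 1] as rows, so the indexed axis moves to the front.
adv = x[0, :, [0, 1]]

# Basic slice: axes keep their original order.
basic = x[0, :, slice(2)]

# Here one result is the transpose of the other.
assert np.array_equal(adv, basic.T)
assert np.array_equal(adv, np.stack([x[0, :, 0], x[0, :, 1]]))
```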

As I understand it, NumPy follows an axis-numbering philosophy when producing the result for a list/tuple-like index.
array([[[1, 1, 1],
        [2, 2, 2]],

       [[3, 3, 3],
        [4, 4, 4]]])
When you have already specified the first two indices (x[0, :, ]), the remaining question is how to extract along the third dimension. When you pass the list [0, 1], NumPy first extracts the slice at index 0 axis-wise, giving [1, 2] (the values at position 0 of the last axis); it then extracts the slice at index 1 likewise and stacks it below the already existing row [1, 2].
[[1, 1, 1],          array([[1, 2],
 [2, 2, 2]]  =====>         [1, 2]])
(visualize this stacking as happening below, not on top of, the already existing row, since axis 0 grows downwards)
Alternatively, NumPy follows the slicing philosophy (start:stop:step) when slice(n) is given for the index. Note that using slice(2) is essentially equal to 0:2 in your example. So it extracts [1, 1] first, then [2, 2]. Note here how [1, 1] ends up on top of [2, 2], again following the same axis philosophy, since we never left the third dimension. This is why this result is the transpose of the other.
array([[1, 1],
       [2, 2]])
Also note that this consistency is preserved for arrays of three or more dimensions. Below is an example with a 4-D array and the corresponding slicing results.
In [327]: xa
Out[327]:
array([[[[ 0,  1,  2],
         [ 3,  4,  5],
         [ 6,  7,  8]],

        [[ 9, 10, 11],
         [12, 13, 14],
         [15, 16, 17]]],


       [[[18, 19, 20],
         [21, 22, 23],
         [24, 25, 26]],

        [[27, 28, 29],
         [30, 31, 32],
         [33, 34, 35]]]])

In [328]: xa[0, 0, :, [0, 1]]
Out[328]:
array([[0, 3, 6],
       [1, 4, 7]])

In [329]: xa[0, 0, :, 0:2]
Out[329]:
array([[0, 1],
       [3, 4],
       [6, 7]])
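The same relationship holds in the 4-D case: the advanced-index result is the transpose of the slice result (a quick check, using np.arange to rebuild the xa shown above):

```python
import numpy as np

# Rebuild the 4-D array from the example above.
xa = np.arange(36).reshape(2, 2, 3, 3)

adv = xa[0, 0, :, [0, 1]]   # advanced index -> shape (2, 3)
sl = xa[0, 0, :, 0:2]       # basic slice    -> shape (3, 2)

# The list index pulls the last axis to the front; the slice keeps it last.
assert adv.shape == (2, 3) and sl.shape == (3, 2)
assert np.array_equal(adv, sl.T)
```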

Related

axis in numpy.ufunc.outer

numpy.ufunc.outer is like Mathematica's Outer[], but what I can't seem to figure out is how to treat a two-dimensional array as a one-dimensional array of one-dimensional arrays. That is, suppose a = [[1, 2], [3, 4]] and b = [[4, 5], [6, 7]]. I want to compute the two-dimensional array whose ij-th element is the distance (however I define it) between the i-th row of a and the j-th row of b; in this case, with the sup-norm distance, the result would be [[3, 5], [1, 3]]. Obviously one can write a loop, but that seems morally wrong, and is precisely what ufunc.outer is meant to avoid.
In [309]: a = np.array([[1, 2], [3, 4]]); b = np.array([[4, 5], [6, 7]])
With broadcasting we can take the row differences:
In [310]: a[:,None,:]-b[None,:,:]
Out[310]:
array([[[-3, -3],
        [-5, -5]],

       [[-1, -1],
        [-3, -3]]])
and reduce those with max of abs on the last axis (I think that's what you mean by the sup norm):
In [311]: np.abs(a[:,None,:]-b[None,:,:]).max(axis=-1)
Out[311]:
array([[3, 5],
       [1, 3]])
With subtract.outer, I have to select a subset of the results, and then transpose:
In [318]: np.subtract.outer(a,b)[:,[0,1],:,[0,1]].transpose(2,1,0)
Out[318]:
array([[[-3, -3],
        [-1, -1]],

       [[-5, -5],
        [-3, -3]]])
I don't see any axis controls in the outer docs. Since broadcasting gives finer control, I haven't seen much use of the ufunc.outer feature.
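The broadcasting approach can be wrapped up as a small helper (the function name pairwise_supnorm is mine; max of abs over the last axis implements the sup norm, and you can swap in another reduction for a different metric):

```python
import numpy as np

def pairwise_supnorm(a, b):
    # a[:, None, :] has shape (n, 1, d) and b[None, :, :] has shape (1, m, d);
    # broadcasting the subtraction gives an (n, m, d) array of row differences,
    # and reducing with abs().max() over the last axis yields the sup norm.
    return np.abs(a[:, None, :] - b[None, :, :]).max(axis=-1)

a = np.array([[1, 2], [3, 4]])
b = np.array([[4, 5], [6, 7]])
print(pairwise_supnorm(a, b))
# [[3 5]
#  [1 3]]
```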

Numpy Indexing and creation of new array

I am trying to understand what is going on inside the print statement.
I know indexing is taking place to create a 2-D array; however, I don't understand the order.
x = np.arange(0,2*np.pi,0.001)
X = np.concatenate(([x], [np.ones(y.shape[0])]), axis=0)
print(X[[[0,1],[0,1]],[[0,0],[-1,-1]]])
Produces:
array([[0.   , 1.   ],
       [6.283, 1.   ]])
I looked into the documentation, and I think the following example from there explains that indexing:
From a 4x3 array the corner elements should be selected using advanced indexing. Thus all elements for which the column is one of [0, 2] and the row is one of [0, 3] need to be selected. To use advanced indexing one needs to select all elements explicitly. Using the method explained previously one could write:
x = np.array([[ 0,  1,  2],
              [ 3,  4,  5],
              [ 6,  7,  8],
              [ 9, 10, 11]])
rows = np.array([[0, 0],
                 [3, 3]], dtype=np.intp)
columns = np.array([[0, 2],
                    [0, 2]], dtype=np.intp)
x[rows, columns]
The last line gives:
array([[ 0,  2],
       [ 9, 11]])
The code in your question does the same operation as that example (with different values), but writes the index arrays inline rather than creating rows and columns first.
X[[[0, 1], [0, 1]], [[0, 0], [-1, -1]]] can be read as: get the elements which are, counting from 0, (in the 0th or 1st row) and (in the 0th or last column).
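If writing out the full index arrays by hand feels error-prone, np.ix_ can build them for you from the per-axis index lists (a sketch using the 4x3 corner-selection example from the docs):

```python
import numpy as np

x = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8],
              [9, 10, 11]])

# np.ix_ turns 1-D index lists into broadcastable index arrays,
# selecting the cross product: rows {0, 3} x columns {0, 2}.
corners = x[np.ix_([0, 3], [0, 2])]
print(corners)
# [[ 0  2]
#  [ 9 11]]
```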

Remove duplicate tuples in numpy array (ones directly next to each other)

I am more or less new to python/numpy and I have this problem:
I have numpy arrays in which the first and last tuples are always the same. In between, there are sometimes duplicate tuples (only ones directly next to each other) that I want to get rid of. The bracket structure used should be maintained.
I tried np.unique already (e.g. 1, 2), but it changes my original order (which has to be maintained). My sample array looks like this:
myarray = np.array([[[1,1],[1,1],[4,4],[4,4],[2,2],[3,3],[1,1]]])
I need a result that looks like this:
myarray = np.array([[[1,1],[4,4],[2,2],[3,3],[1,1]]])
Thank you in advance for your support!
Compare each element with its neighbour (offset by one) along the second axis and use boolean indexing to select -
myarray[:,np.r_[True,(myarray[0,1:] != myarray[0,:-1]).any(-1)]]
Sample run -
In [42]: myarray
Out[42]:
array([[[1, 1],
        [1, 1],
        [4, 4],
        [4, 4],
        [2, 2],
        [3, 3],
        [1, 1]]])

In [43]: myarray[:,np.r_[True,(myarray[0,1:] != myarray[0,:-1]).any(-1)]]
Out[43]:
array([[[1, 1],
        [4, 4],
        [2, 2],
        [3, 3],
        [1, 1]]])
Or with equality comparison, then negate the rows where ALL components match -
In [47]: myarray[:,np.r_[True,~((myarray[0,1:] == myarray[0,:-1]).all(-1))]]
Out[47]:
array([[[1, 1],
        [4, 4],
        [2, 2],
        [3, 3],
        [1, 1]]])

Efficiently change order of numpy array

I have a 3 dimensional numpy array. The dimension can go up to 128 x 64 x 8192. What I want to do is to change the order in the first dimension by interchanging pairwise.
The only idea I had so far is to create a list of the indices in the correct order.
order = [1,0,3,2...127,126]
data_new = data[order]
I fear that this is not very efficient, but I have no better idea so far.
You could reshape to split the first axis into two axes such that the latter of those axes has length 2, then flip the array along that axis with [::-1], and finally reshape back to the original shape.
Thus, we would have an implementation like so -
a.reshape(-1,2,*a.shape[1:])[:,::-1].reshape(a.shape)
Sample run -
In [170]: a = np.random.randint(0,9,(6,3))

In [171]: order = [1,0,3,2,5,4]

In [172]: a[order]
Out[172]:
array([[0, 8, 5],
       [4, 5, 6],
       [0, 0, 2],
       [7, 3, 8],
       [1, 6, 3],
       [2, 4, 4]])

In [173]: a.reshape(-1,2,*a.shape[1:])[:,::-1].reshape(a.shape)
Out[173]:
array([[0, 8, 5],
       [4, 5, 6],
       [0, 0, 2],
       [7, 3, 8],
       [1, 6, 3],
       [2, 4, 4]])
Alternatively, if you are looking to efficiently create those constantly flipping indices order, we could do something like this -
order = np.arange(data.shape[0]).reshape(-1,2)[:,::-1].ravel()
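Both approaches can be checked against each other on a small example (a quick sketch; data here is a stand-in for your 128 x 64 x 8192 array, and only requires the first axis length to be even):

```python
import numpy as np

data = np.arange(24).reshape(6, 2, 2)  # stand-in for the real array

# Reshape-and-flip: group the first axis into pairs, reverse within each pair.
swapped = data.reshape(-1, 2, *data.shape[1:])[:, ::-1].reshape(data.shape)

# Index-array alternative: build the order [1, 0, 3, 2, 5, 4] programmatically.
order = np.arange(data.shape[0]).reshape(-1, 2)[:, ::-1].ravel()

assert np.array_equal(order, [1, 0, 3, 2, 5, 4])
assert np.array_equal(swapped, data[order])
```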

Python Matrix sorting via one column

I have an n x 2 matrix of integers. The first column is a series 0, 1, -1, 2, -2; however, these are in the order in which they were compiled from their constituent matrices. The second column is a list of indices from another list.
I would like to sort the matrix via this second column. This would be equivalent to selecting two columns of data in Excel, and sorting via Column B (where the data is in columns A and B). Keep in mind, the adjacent data in the first column of each row should be kept with its respective second column counterpart. I have looked at solutions using the following:
data[np.argsort(data[:, 0])]
But this does not seem to work. The matrix in question looks like this:
matrix([[   1,    1],
        [   1,    3],
        [   1,    7],
        ...,
        [   2, 1021],
        [   2, 1040],
        [   2, 1052]])
You could use np.lexsort:
numpy.lexsort(keys, axis=-1)
Perform an indirect sort using a sequence of keys.
Given multiple sorting keys, which can be interpreted as columns in a
spreadsheet, lexsort returns an array of integer indices that
describes the sort order by multiple columns.
In [13]: data = np.matrix(np.arange(10)[::-1].reshape(-1,2))

In [14]: data
Out[14]:
matrix([[9, 8],
        [7, 6],
        [5, 4],
        [3, 2],
        [1, 0]])

In [15]: temp = data.view(np.ndarray)

In [16]: np.lexsort((temp[:, 1], ))
Out[16]: array([4, 3, 2, 1, 0])

In [17]: temp[np.lexsort((temp[:, 1], ))]
Out[17]:
array([[1, 0],
       [3, 2],
       [5, 4],
       [7, 6],
       [9, 8]])
Note that if you pass more than one key to np.lexsort, the last key is the primary key, the next-to-last is the secondary key, and so on.
Using np.lexsort as I show above requires a temporary array because np.lexsort does not work on numpy matrices. Since temp = data.view(np.ndarray) creates a view rather than a copy of data, it does not require much extra memory. However, temp[np.lexsort((temp[:, 1], ))] is a new array, which does require more memory.
There is also a way to sort by columns in-place. The idea is to view the array as a structured array with two columns. Unlike plain ndarrays, structured arrays have a sort method which allows you to specify columns as keys:
In [65]: data.dtype
Out[65]: dtype('int32')

In [66]: temp2 = data.ravel().view('int32, int32')

In [67]: temp2.sort(order = ['f1', 'f0'])
Notice that since temp2 is a view of data, it does not require allocating new memory and copying the array. Also, sorting temp2 modifies data at the same time:
In [69]: data
Out[69]:
matrix([[1, 0],
        [3, 2],
        [5, 4],
        [7, 6],
        [9, 8]])
You had the right idea, just off by a few characters:
>>> import numpy as np
>>> data = np.matrix([[9, 8],
...                   [7, 6],
...                   [5, 4],
...                   [3, 2],
...                   [1, 0]])
>>> data[np.argsort(data.A[:, 1])]
matrix([[1, 0],
        [3, 2],
        [5, 4],
        [7, 6],
        [9, 8]])
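If you work with a plain ndarray instead of np.matrix, the same idea is a one-liner (a minimal sketch; np.argsort on column 1 gives the row permutation directly, with no .A needed):

```python
import numpy as np

data = np.array([[9, 8],
                 [7, 6],
                 [5, 4],
                 [3, 2],
                 [1, 0]])

# Sort rows by the second column: argsort returns the row order,
# and fancy indexing applies it to all rows at once.
sorted_rows = data[np.argsort(data[:, 1])]
print(sorted_rows)
# [[1 0]
#  [3 2]
#  [5 4]
#  [7 6]
#  [9 8]]
```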
