I have a 3D array and I need to "squeeze" it over the last axis, so that I get a 2D array. I need to do it in the following way. For each pair of indices for the first two dimensions, I know the index along the 3rd dimension from which the value should be taken.
For example, I know that if i1 == 2 and i2 == 7 then i3 == 11. It means that out[2,7] = inp[2,7,11]. This mapping from first two dimensions into the third one is given in another 2D array. In other words, I have an array in which on the position 2,7 I have 11 as a value.
So, my question is how to combine these two arrays (3D and 2D) to get the output array (2D).
In [635]: arr = np.arange(24).reshape(2,3,4)
In [636]: idx = np.array([[1,2,3],[0,1,2]])
In [637]: I,J = np.ogrid[:2,:3]
In [638]: arr[I,J,idx]
Out[638]:
array([[ 1, 6, 11],
[12, 17, 22]])
In [639]: arr
Out[639]:
array([[[ 0, 1, 2, 3], # 1
[ 4, 5, 6, 7], # 6
[ 8, 9, 10, 11]], # 11
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
I,J broadcast together to select a (2,3) set of values, matching idx:
In [640]: I
Out[640]:
array([[0],
[1]])
In [641]: J
Out[641]: array([[0, 1, 2]])
This is a generalization to 3d of the easier 2d problem - selecting one item from each row:
In [649]: idx
Out[649]:
array([[1, 2, 3],
[0, 1, 2]])
In [650]: idx[np.arange(2), [0,1]]
Out[650]: array([1, 1])
In fact we could convert the 3d problem into a 2d one:
In [655]: arr.reshape(6,4)[np.arange(6), idx.ravel()]
Out[655]: array([ 1, 6, 11, 12, 17, 22])
Generalizing the original case:
In [55]: arr = np.arange(24).reshape(2,3,4)
In [56]: idx = np.array([[1,2,3],[0,1,2]])
In [57]: IJ = np.ogrid[[slice(i) for i in idx.shape]]
In [58]: IJ
Out[58]:
[array([[0],
[1]]), array([[0, 1, 2]])]
In [59]: (*IJ,idx)
Out[59]:
(array([[0],
[1]]), array([[0, 1, 2]]), array([[1, 2, 3],
[0, 1, 2]]))
In [60]: arr[_]
Out[60]:
array([[ 1, 6, 11],
[12, 17, 22]])
The key is in combining the IJ list of arrays with the idx to make a new indexing tuple. Constructing the tuple is a little messier if idx isn't the last index, but it's still possible. E.g.
In [61]: (*IJ[:-1],idx,IJ[-1])
Out[61]:
(array([[0],
[1]]), array([[1, 2, 3],
[0, 1, 2]]), array([[0, 1, 2]]))
In [62]: arr.transpose(0,2,1)[_]
Out[62]:
array([[ 1, 6, 11],
[12, 17, 22]])
Or, if it's easier, transpose arr so that the idx dimension is last. The key is that the index operation takes a tuple of index arrays, arrays which broadcast against each other to select specific items.
That's what ogrid is doing: creating the arrays that work with idx.
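As an alternative to building the index tuple by hand, the same selection can be sketched with np.take_along_axis. This function is not used in the answer above; it is swapped in here as an illustration and assumes NumPy 1.15 or later.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)
idx = np.array([[1, 2, 3], [0, 1, 2]])

# take_along_axis expects idx to have as many dimensions as arr,
# so add a trailing length-1 axis and drop it again afterwards.
out = np.take_along_axis(arr, idx[..., None], axis=-1)[..., 0]
print(out)
```

This should give the same (2, 3) result as arr[I, J, idx] without constructing I and J explicitly.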
inp = np.random.random((20, 10, 5)) # simulate some input
i1, i2 = np.indices(inp.shape[:2])
i3 = np.random.randint(0, 5, size=inp.shape[:2]) # or implement whatever mapping
# you want between (i1,i2) and i3
out = inp[(i1, i2, i3)]
See https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing for more details
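A minimal check of the np.indices approach against an explicit loop, assuming i3 has the same shape as the first two dimensions of inp. The random generator and the name expected are illustrative and not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
inp = rng.random((20, 10, 5))                 # simulated input
i3 = rng.integers(0, 5, size=inp.shape[:2])   # one third-axis index per (i1, i2)

i1, i2 = np.indices(inp.shape[:2])            # both have shape (20, 10)
out = inp[i1, i2, i3]

# Reference result computed element by element.
expected = np.empty(inp.shape[:2])
for a in range(inp.shape[0]):
    for b in range(inp.shape[1]):
        expected[a, b] = inp[a, b, i3[a, b]]

print(np.array_equal(out, expected))
```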
Using numpy.einsum
This can be achieved by a combination of array indexing and usage of numpy.einsum:
>>> numpy.einsum('ijij->ij', inp[:, :, indices])
inp[:, :, indices] creates a four-dimensional array where, for each pair of the first two indices (the first two dimensions), all entries of the index array are applied to the third dimension. Because the index array is two-dimensional, this results in 4D. However, you only want the entry of the index array that corresponds to the position in the first two dimensions. This is achieved by using the string ijij->ij, which tells einsum to select only those elements where the indices of the 1st and 3rd axes, and of the 2nd and 4th axes, are equal. Because the last two dimensions (3rd and 4th) were added by the index array, this is equivalent to selecting only the index indices[i, j] for the third dimension of inp.
Note that this method can really blow up the memory consumption: the intermediate array inp[:, :, indices] has (inp.shape[0] * inp.shape[1]) ** 2 elements, so it grows quadratically with the size of the first two dimensions, regardless of inp.shape[2].
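To make the size of the intermediate array concrete, here is a small sketch on a (2, 3, 4) input. The array contents and the name intermediate are chosen only for illustration.

```python
import numpy as np

inp = np.arange(24).reshape(2, 3, 4)
indices = np.array([[1, 2, 3], [0, 1, 2]])

# Fancy indexing on the last axis appends the shape of `indices`,
# giving a 4D array of shape (2, 3, 2, 3).
intermediate = inp[:, :, indices]
print(intermediate.shape, intermediate.size)

# Keep only the entries where axis 0 == axis 2 and axis 1 == axis 3.
out = np.einsum('ijij->ij', intermediate)
print(out)
```

Even for this tiny input the intermediate holds 36 elements while the result holds 6, which is why the approach scales poorly.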
Building the indices manually
First we prepare the new index array:
>>> idx = numpy.array(list(
... numpy.ndindex(*inp.shape[:2], 1) # Python 3 syntax
... ))
Then we update the column which corresponds to the third axis:
>>> idx[:, 2] = indices[idx[:, 0], idx[:, 1]]
Now we can select the elements and simply reshape the result:
>>> inp[tuple(idx.T)].reshape(*inp.shape[:2])
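Put together, the three steps above might look like the following on a small example. The input values are assumed for illustration only.

```python
import numpy as np

inp = np.arange(24).reshape(2, 3, 4)
indices = np.array([[1, 2, 3], [0, 1, 2]])

# One row (i, j, 0) per position in the first two dimensions, in C order.
idx = np.array(list(np.ndindex(*inp.shape[:2], 1)))

# Replace the placeholder third column with the wanted third-axis index.
idx[:, 2] = indices[idx[:, 0], idx[:, 1]]

# Index with one array per axis, then restore the 2D layout.
out = inp[tuple(idx.T)].reshape(inp.shape[:2])
print(out)
```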
Using numpy.choose
Note: numpy.choose allows a maximum size of 32 for the axis which is chosen from.
According to this answer and the documentation of numpy.choose we can also use the following:
# First we need to bring the last axis to the front because
# `numpy.choose` chooses from the first axis.
>>> new_inp = numpy.moveaxis(inp, -1, 0)
# Now we can select the elements.
>>> numpy.choose(indices, new_inp)
Although the documentation discourages the use of a single array for the 2nd argument (the choices)
To reduce the chance of misinterpretation, even though the following “abuse” is nominally supported, choices should neither be, nor be thought of as, a single array, i.e., the outermost sequence-like container should be either a list or a tuple.
this seems to be the case only for preventing misunderstandings:
choices : sequence of arrays
Choice arrays. a and all of the choices must be broadcastable to the same shape. If choices is itself an array (not recommended), then its outermost dimension (i.e., the one corresponding to choices.shape[0]) is taken as defining the “sequence”.
So from my point of view there's nothing wrong with using numpy.choose that way, as long as one is aware of what they're doing.
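A runnable sketch of the numpy.choose variant on a small input follows. It also shows the list form that the documentation prefers; the names out_array and out_list are illustrative.

```python
import numpy as np

inp = np.arange(24).reshape(2, 3, 4)
indices = np.array([[1, 2, 3], [0, 1, 2]])

# Move the axis to choose from to the front: shape (4, 2, 3).
new_inp = np.moveaxis(inp, -1, 0)

# Passing the array directly (the "nominally supported" form).
out_array = np.choose(indices, new_inp)

# Passing a list of 2D arrays, as the documentation recommends.
out_list = np.choose(indices, list(new_inp))

print(out_array)
```

Both calls should return the same (2, 3) array as long as the chosen axis stays within the size limit mentioned in the note above.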
I believe this should do it:
for i in range(n):
    for j in range(m):
        k = index_mapper[i][j]
        value = input_3d[i][j][k]
        out_2d[i][j] = value
As the title says, how do I assign multiple rows and columns of one array to the same rows and columns of another array in Python?
I want to do the following:
Kn[0, 0] = KeTrans[startPosRow, startPosCol];
Kn[0, 1] = KeTrans[startPosRow, endPosCol];
Kn[1, 0] = KeTrans[endPosRow, startPosCol];
Kn[1, 1] = KeTrans[endPosRow, endPosCol];
Kn is a 2X2 matrix and KeTrans is a 4X4.
I tried the following but with no luck.
Kn[0:1, 0:1] = KeTrans[startPosRow: endPosRow, startPosCol: endPosCol]
But they're not the same rows and columns :-) (unless startPosRow and friends have very specific values).
The problem is that the slice startPosRow:endPosRow (for example) does not mean "element startPosRow and element endPosRow". It means "elements in range(startPosRow, endPosRow)", which doesn't include endPosRow and which typically has more than two matching indices.
If you just want the four corners, you could use slices with a step size:
Kn[0:2, 0:2] = KeTrans[startPosRow:endPosRow + 1:endPosRow - startPosRow,
                       startPosCol:endPosCol + 1:endPosCol - startPosCol]
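Another way to pick the four corners, not mentioned in the answer above, is np.ix_, which builds an open mesh from explicit row and column lists. The sketch below uses made-up values for KeTrans and the four positions.

```python
import numpy as np

KeTrans = np.arange(16).reshape(4, 4)   # stand-in for the real 4x4 matrix
startPosRow, endPosRow = 0, 2           # illustrative positions
startPosCol, endPosCol = 1, 3

Kn = np.zeros((2, 2), dtype=KeTrans.dtype)

# np.ix_ pairs every listed row with every listed column.
Kn[:] = KeTrans[np.ix_([startPosRow, endPosRow], [startPosCol, endPosCol])]
print(Kn)

# The stepped-slice form selects the same four elements.
stepped = KeTrans[startPosRow:endPosRow + 1:endPosRow - startPosRow,
                  startPosCol:endPosCol + 1:endPosCol - startPosCol]
```

Unlike the stepped slice, the np.ix_ form does not need a non-zero step, so it also works when the start and end positions coincide.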
For multi-dimensional arrays, I highly suggest using NumPy.
import numpy as np
To create an N-dimensional array:
a = np.array([[4, 2, 4], [23, 4, 3], [2, 3, 4]])
Arrays are indexed very intuitively:
>>> a[0, 1]
2
You can even slice a NumPy array.
documentation of numpy multi-dimensional array: https://numpy.org/doc/stable/reference/arrays.ndarray.html
Is this what you are trying to do:
In [323]: X = np.arange(16).reshape(4,4)
In [324]: Y = np.zeros((2,2),int)
In [325]: Y[:] = X[:2,:2]
In [326]: Y
Out[326]:
array([[0, 1],
[4, 5]])
In [327]: Y[:] = X[1:3,2:]
In [328]: Y
Out[328]:
array([[ 6, 7],
[10, 11]])
For reference
In [329]: X
Out[329]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In Python, I need to split two rows in half, take the first half from row 1 and the second half from row 2, and concatenate them into an array which is then saved as a row in another 2D array. For example,
values=np.array([[1,2,3,4],[5,6,7,8]])
will become
Y[2,:] = [1,2,7,8]  # 2 is arbitrarily chosen
I tried doing this with concatenate but got an error
only integer scalar arrays can be converted to a scalar index
x=values.shape[1]
pop[y,:]=np.concatenate(values[temp0,0:int((x-1)/2)],values[temp1,int((x-1)/2):x+1])
temp0 and temp1 are integers, and values is a 2d integer array of dimensions (100,x)
np.concatenate takes a list of arrays, plus a scalar axis parameter (optional)
In [411]: values=np.array([[1,2,3,4],[5,6,7,8]])
...:
Nothing wrong with how you split values:
In [412]: x=values.shape[1]
In [413]: x
Out[413]: 4
In [415]: values[0,0:int((x-1)/2)],values[1,int((x-1)/2):x+1]
Out[415]: (array([1]), array([6, 7, 8]))
wrong:
In [416]: np.concatenate(values[0,0:int((x-1)/2)],values[1,int((x-1)/2):x+1])
----
TypeError: only integer scalar arrays can be converted to a scalar index
It's trying to interpret the 2nd argument as an axis parameter, hence the scalar error message.
right:
In [417]: np.concatenate([values[0,0:int((x-1)/2)],values[1,int((x-1)/2):x+1]])
Out[417]: array([1, 6, 7, 8])
There are other concatenate front ends. Here hstack would work the same. np.append takes 2 arrays, so would work - but too often people use it wrongly. np.r_ is another front end with different syntax.
The indexing might be clearer with:
In [423]: idx = (x-1)//2
In [424]: np.concatenate([values[0,:idx],values[1,idx:]])
Out[424]: array([1, 6, 7, 8])
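The concatenate front ends mentioned above can be compared side by side. In this sketch the split point is assumed to be x // 2 rather than (x - 1) // 2, so that the result matches the [1, 2, 7, 8] row asked for in the question.

```python
import numpy as np

values = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
idx = values.shape[1] // 2          # assumed midpoint, gives true halves

first, second = values[0, :idx], values[1, idx:]

via_concatenate = np.concatenate([first, second])   # takes a list of arrays
via_hstack = np.hstack([first, second])             # also takes a list
via_append = np.append(first, second)               # takes exactly two arrays
via_r = np.r_[first, second]                        # index-style syntax

print(via_concatenate)
```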
Try numpy.append
numpy.append Documentation
np.append(values[temp0,0:int((x-1)/2)],values[temp1,int((x-1)/2):x+1])
You don't need splitting and/or concatenation. Just use indexing:
In [47]: values=np.array([[1,2,3,4],[5,6,7,8]])
In [48]: values[[[0], [1]],[[0, 1], [-2, -1]]]
Out[48]:
array([[1, 2],
[7, 8]])
Or ravel to get the flattened version:
In [49]: values[[[0], [1]],[[0, 1], [-2, -1]]].ravel()
Out[49]: array([1, 2, 7, 8])
As a more general approach you can also utilize np.r_ as follows:
In [61]: x, y = values.shape
In [62]: values[np.arange(x)[:,None],[np.r_[0:y//2], np.r_[-y//2:0]]].ravel()
Out[62]: array([1, 2, 7, 8])
Reshape to split the second dimension in two; stack the part you want.
a = np.array([[1,2,3,4],[5,6,7,8]])
b = a.reshape(a.shape[0], 2, a.shape[1]//2)  # axis 1 selects the half, axis 2 indexes within it
new_row = np.hstack([b[0,0,:], b[1,1,:]])
#new_row = np.hstack([b[0,0], b[1,1]])
I understand how
x = np.array([[1, 2], [3, 4], [5, 6]])
y = x[[0,1,2], [0,1,0]]
The output is y = [1 4 5]. This just takes the first list as rows and the second list as columns.
But how does the below work?
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
rows = np.array([[0,0],[3,3]])
cols = np.array([[0,2],[0,2]])
y = x[rows,cols]
This gives the output of :
[[ 0 2]
[ 9 11]]
Can you please explain the logic when using ndarrays as the indexing object? Why does it have a 2D array for both rows and columns? How are the rules different when the indexing object is an ndarray as opposed to a Python list?
We have the following array x:
x = np.array([[1, 2], [3, 4], [5, 6]])
And the indices [0, 1, 2] and [0, 1, 0], which when used to index x like
x[[0,1,2], [0,1,0]]
gives
[1, 4, 5]
The indices that we used basically translate to:
[0, 1, 2] & [0, 1, 0] --> [0,0], [1,1], [2,0]
Since we used 1D lists as indices, we get a 1D array as the result.
With that knowledge, let's see the next case. Now we have the array x as:
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
Now the indices are 2D arrays.
rows = np.array([[0,0],[3,3]])
cols = np.array([[0,2],[0,2]])
This when indexed into the array x like:
x[rows,cols]
simply translates to:
[[0,0],[3,3]]
| | | | ====> [[0,0]], [[0,2]], [[3,0]], [[3,2]]
[[0,2],[0,2]]
Now it's easy to see how these 4 index pairs, when used to index the array x, give the following result (i.e. here they simply return the corner elements of our array x):
[[ 0, 2]
[ 9, 11]]
Note that in this case we get the result as a 2D array (as opposed to a 1D array in the first case) since our indices rows and cols were themselves 2D arrays (i.e. equivalently lists of lists), whereas in the first case our indices were 1D arrays (or equivalently simple lists without any nesting).
So, if you need a 2D array as the result, you need to give 2D arrays as indices (or index arrays that broadcast to a 2D shape).
The easiest way to wrap one's head around this is the following observation: The shape of the output is determined by the shape of the index array, or more precisely the shape resulting from broadcasting all the index arrays together.
Look at it like that: you have an array A of a given shape and another array V of some other shape and you want to fill A with values from V. What do you need to specify? Well, for each position in A you need to specify coordinates of some element in V. Therefore if V is ND you need N index arrays of the same shape as A or at least broadcastable to that. Then you index V by putting these index arrays at their coordinate positions in the [] expression.
To keep it simple, we'll stay 2D and assume rows.shape == cols.shape. (You can relax this rule with broadcasting, but for now we won't.) We'll call this shape (I, J);
then y = x[rows, cols] is the same as:
y = np.empty((I, J))
for i in range(I):
    for j in range(J):
        y[i, j] = x[rows[i, j], cols[i, j]]
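The loop equivalence, and the broadcasting case set aside above, can be checked with a short sketch. The names y_loop, y_broadcast and y_ix are illustrative.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)
rows = np.array([[0, 0], [3, 3]])
cols = np.array([[0, 2], [0, 2]])

y = x[rows, cols]

# Explicit loop with the same meaning as x[rows, cols].
I, J = rows.shape
y_loop = np.empty((I, J), dtype=x.dtype)
for i in range(I):
    for j in range(J):
        y_loop[i, j] = x[rows[i, j], cols[i, j]]

# Index arrays of shape (2, 1) and (1, 2) broadcast to the same (2, 2) result.
y_broadcast = x[np.array([[0], [3]]), np.array([[0, 2]])]

# np.ix_ builds such broadcastable index arrays from plain lists.
y_ix = x[np.ix_([0, 3], [0, 2])]

print(y)
```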
I have a 3d numpy array (n_samples x num_components x 2); in the example below, n_samples = 5 and num_components = 7.
I have another array (indices) which is the selected component for each sample which is of shape (n_samples,).
I want to select from the data array given the indices so that the resulting array is n_samples x 2.
The code is below:
import numpy as np
np.random.seed(77)
data=np.random.randint(low=0, high=10, size=(5, 7, 2))
indices = np.array([0, 1, 6, 4, 5])
#how can I select indices from the data array?
For example for data 0, the selected component should be the 0th and for data 1 the selected component should be 1.
Note that I can't use any for loops because I'm using it in Theano and the solution should be solely based on numpy.
Is this what you are looking for?
In [36]: data[np.arange(data.shape[0]),indices,:]
Out[36]:
array([[7, 4],
[7, 3],
[4, 5],
[8, 2],
[5, 8]])
To get component #0, use
data[:, 0]
i.e. we get every entry on axis 0 (samples), and only entry #0 on axis 1 (components), and implicitly everything on the remaining axes.
This can be easily generalized to
data[:, indices]
to select all relevant components.
But what OP really wants is just the diagonal of this array, i.e. (data[0, indices[0]], data[1, indices[1]], ...). The diagonal of a high-dimensional array can be extracted using the diagonal function:
>>> np.diagonal(data[:, indices])
array([[7, 7, 4, 8, 5],
[4, 3, 5, 2, 8]])
(You may need to transpose the result.)
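To see how the diagonal approach relates to direct integer-array indexing, the two can be compared on a deterministic stand-in for data. The arange-based array below is an assumption made so the check does not depend on random values.

```python
import numpy as np

data = np.arange(70).reshape(5, 7, 2)     # n_samples x num_components x 2
indices = np.array([0, 1, 6, 4, 5])

# Pair sample k with component indices[k] directly: shape (5, 2).
direct = data[np.arange(data.shape[0]), indices]

# data[:, indices] has shape (5, 5, 2); its diagonal over the first
# two axes has shape (2, 5), so transpose to get (5, 2).
via_diagonal = np.diagonal(data[:, indices]).T

print(np.array_equal(direct, via_diagonal))
```

The direct form avoids building the (5, 5, 2) intermediate, which matters when n_samples is large.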
You have a variety of ways to do so, but this is my loop recommendation:
selection = np.array([ datum[indices[k]] for k,datum in enumerate(data)])
The resulting array, selection, has the desired shape.
I am trying to pull out a particular slice of a numpy array but don't know how to express it with a tuple of indices. Using a tuple of indices works if its length is the same as the number of dimensions:
ind = (1,2,3)
# these two values are the same
foo[1,2,3]
foo[ind]
But if I want to get a slice along one dimension the same notation doesn't work:
ind = (2,3)
# these two values are not the same
foo[:,2,3]
foo[:,ind]
# and this isn't even proper syntax
foo[:,*ind]
So is there a way to use a named tuple of indices together with slices?
Instead of using the : syntax you can explicitly create the slice object and add that to the tuple:
ind = (2, 3)
s = slice(None) # equivalent to ':'
foo[(s,) + ind] # add s to tuples
In contrast to using foo[:, ind], the result of this should be the same as foo[:,2,3].
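A small sketch confirming the behaviour follows, using an assumed (3, 4, 5) array for foo. On Python 3 the tuple can also be built with unpacking inside a parenthesised tuple.

```python
import numpy as np

foo = np.arange(60).reshape(3, 4, 5)
ind = (2, 3)
s = slice(None)                      # equivalent to ':'

a = foo[(s,) + ind]                  # same as foo[:, 2, 3]
b = foo[:, 2, 3]
c = foo[(slice(None), *ind)]         # tuple built with unpacking

# foo[:, ind] instead treats ind as a list of positions along axis 1.
d = foo[:, ind]

print(a, d.shape)
```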
For accessing 2D arrays...
I believe what you are suggesting should work. Be mindful that numpy arrays index starting from 0. So to pull the first and third column from the following matrix I use column indices 0 and 2.
import numpy as np
foo = np.array([[1,2,3],[4,5,6],[7,8,9]])
ind = (0,2)
foo[:,ind]
For accessing 3D arrays...
3D numpy arrays are accessed by 3 values x[i,j,k] where "i" represents the first matrix slice, or
[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]]
from my example below. "j" represents the second matrix slice, or the rows of these matrices. And "k" represents their columns. Each of i, j and k can be :, an integer, or a tuple. So we can access particular elements by using two named tuples as follows.
import numpy as np
foo2 = np.array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
ind1 = (1,2)
ind2 = (0,1)
foo2[:,ind1,ind2]
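It may help to spell out what this last expression returns. The two tuples are paired element by element, so the sketch below (using an arange-based array equal to foo2) selects positions (1, 0) and (2, 1) from every matrix, not a 2x2 block. The np.ix_ line is an added contrast and not part of the answer above.

```python
import numpy as np

foo2 = np.arange(27).reshape(3, 3, 3)
ind1 = (1, 2)
ind2 = (0, 1)

# Pairs (row 1, col 0) and (row 2, col 1) in each matrix: shape (3, 2).
paired = foo2[:, ind1, ind2]
print(paired)

# For the full 2x2 block of rows ind1 and columns ind2 in each matrix,
# index each matrix with an open mesh instead: shape (3, 2, 2).
block = np.stack([m[np.ix_(ind1, ind2)] for m in foo2])
print(block.shape)
```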