I know how to do this when the number of dimensions of the array is known while coding; I have seen "Select 'area' from a 2D array in python".
I am trying to figure out how to extract a "volume" from an array with any number of dimensions.
I know how to slice arrays, e.g. a[0:10].
What I essentially want is a[lower_bound:higher_bound], where the bounds are arrays that specify the location in each dimension.
Something like 0:2, 2:4 in the answer you linked is just a tuple of slice objects. You can create such a tuple yourself using any code you want, for example t = tuple(slice(lo, hi) for lo, hi in zip(lower_bound, higher_bound)), and then do a[t] to slice with that tuple.
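A minimal sketch of that idea, assuming the bounds arrive as two sequences with one entry per dimension. The names extract_volume, lower and upper are made up for illustration and are not part of NumPy.

```python
import numpy as np

def extract_volume(a, lower, upper):
    """Return a[lower[0]:upper[0], lower[1]:upper[1], ...] (hypothetical helper)."""
    # Build one slice object per dimension and pack them into a tuple.
    index = tuple(slice(lo, hi) for lo, hi in zip(lower, upper))
    return a[index]

a = np.arange(4 * 5 * 6).reshape(4, 5, 6)
sub = extract_volume(a, lower=[0, 2, 1], upper=[2, 4, 5])
print(sub.shape)  # (2, 2, 4)
```

Because only basic slices are involved, the result is a view of a, so writing to sub also changes a.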
Suppose I have multiple NxN 2D arrays stored into a list in Python 3. I want to collapse all the arrays into 1 array, with the same dimensions NxN, but such that each element of this new array contains a 1xN array of the corresponding values from the original arrays.
To give you some more context, each array in this list corresponds to the set of values at a given time. For each new time point, I am storing the updated version of that array into the list. Once that's done, I want to compute the standard deviation of the values at each (i,j) element in the array.
I tried using a for loop, but it takes far too long for my simulations because this is a set of 100,000 arrays. I was wondering if there were any numpy or vectorized functions that can help me perform this operation more efficiently. Thanks!
Let's say l is your list of arrays. To get the standard deviation of the corresponding elements of those arrays as a single array, stack them along a new first axis and reduce over that axis:
std_l = np.std(np.stack(l),axis=0)
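As a rough check of the one-liner above, here is a small sketch that uses randomly generated 4x4 arrays as a stand-in for the simulation output (the data and the sizes are assumptions, not the asker's values).

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the simulation: 1000 arrays of shape (4, 4).
l = [rng.normal(size=(4, 4)) for _ in range(1000)]

stacked = np.stack(l)            # shape (1000, 4, 4); axis 0 is "time"
std_l = np.std(stacked, axis=0)  # shape (4, 4); one std per (i, j) element

# Spot-check one element against the explicit per-element computation.
expected = np.std([arr[1, 2] for arr in l])
print(std_l.shape, np.isclose(std_l[1, 2], expected))
```

Note that np.std computes the population standard deviation by default (ddof=0); pass ddof=1 if the sample estimate is wanted.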
There is this great Question/Answer about slicing the last dimension:
Numpy slice of arbitrary dimensions: for slicing a numpy array to obtain the i-th index in the last dimension, one can use ... or Ellipsis,
slice = myarray[...,i]
What if the first N dimensions are needed ?
For 3D myarray, N=2:
slice = myarray[:,:,0]
For 4D myarray, N=2:
slice = myarray[:,:,0,0]
Can this be generalized to an arbitrary number of dimensions?
I don't think there's any built-in syntactic sugar for that, but slices are just objects like anything else. The slice(None) object is what is created from :, and otherwise just picking the index 0 works fine.
myarray[(slice(None),)*N+(0,)*(myarray.ndim-N)]
Note the comma in (slice(None),). Parentheses alone don't create a tuple in Python (except for the empty pair ()); it is the comma that makes a one-element tuple rather than a parenthesized expression that is simply evaluated.
Slices are nice because they give you a view into the object instead of a copy of the object. You can use the same idea to, e.g., iterate along the N-th dimension while holding every other dimension fixed. There have been some Stack Overflow questions about that, and they've almost unanimously resorted to rolling the indices and other things that I think are hard to reason about in high-dimensional spaces. Slice tuples are your friend.
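The slice-tuple expression can be wrapped in a small helper. This is only a sketch: first_n_dims and its fixed parameter are hypothetical names introduced here for illustration.

```python
import numpy as np

def first_n_dims(arr, n, fixed=0):
    """Keep the first n dimensions and fix every later one at index `fixed`."""
    index = (slice(None),) * n + (fixed,) * (arr.ndim - n)
    return arr[index]

myarray = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
result = first_n_dims(myarray, 2)

print(result.shape)                       # (2, 3)
print(np.shares_memory(result, myarray))  # True: the result is a view
```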
From the comments, @PaulPanzer points out another technique that I rather like.
myarray.T[(myarray.ndim-N)*(0,)].T
First, transposes in numpy are view-operations instead of copy-operations. This isn't inefficient in the slightest. Here's how it works:
Start with myarray with dimensions (0,...,k)
The transpose myarray.T reorders those to (k,...,0)
The whole goal is to fix the last myarray.ndim-N dimensions of the original array. After the transpose those are the first myarray.ndim-N dimensions, so we index each of them at 0 with [(myarray.ndim-N)*(0,)].
They're in the wrong order. We have dimensions (N-1,...,0). Use another transpose with .T to get the ordering (0,...,N-1) instead.
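The steps above can be traced on a small array. This sketch uses an arbitrary 4D example (the shape and N=2 are assumptions) and compares the result with the slice-tuple version.

```python
import numpy as np

myarray = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
N = 2

t = myarray.T                           # axes reversed: shape (5, 4, 3, 2)
picked = t[(myarray.ndim - N) * (0,)]   # index the first two axes at 0: shape (3, 2)
via_transpose = picked.T                # restore the original order: shape (2, 3)

# Same selection written with an explicit tuple of slices.
via_slices = myarray[(slice(None),) * N + (0,) * (myarray.ndim - N)]

print(via_transpose.shape)                        # (2, 3)
print(np.array_equal(via_transpose, via_slices))  # True
```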
I am working with multiple multidimensional arrays. Let us consider dummy example for simplicity:
array_list=[np.ones(3), np.ones((3,3,3)), np.ones((3,3)), np.ones(3)]
I need to index the last dimension of each array in the list. For example, my goal is to set some of the elements to zero according to a specified range in the last dimension:
array_list[0][0:2]=0
array_list[1][:,:,0:2]=0
array_list[2][:,0:2]=0
array_list[3][0:2]=0
In my real application I don't know how many arrays I have and how many dimensions are in there.
I would like to do the task in a for loop:
for array in array_list:
    array[???]=0
But I am struggling to implement this when I don't know the dimensionality of each array.
Use the Ellipsis (...) in place of the leading dimensions. It expands to as many full slices (:) as are needed to cover every dimension except the last, so for a 1-D array it expands to nothing.
for array in array_list:
    array[..., 0:2] = 0
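A quick sketch, reusing the dummy array_list from the question, that confirms the Ellipsis form matches the hand-written index for each dimensionality:

```python
import numpy as np

array_list = [np.ones(3), np.ones((3, 3, 3)), np.ones((3, 3)), np.ones(3)]

for array in array_list:
    # Ellipsis stands for however many leading full slices each array needs.
    array[..., 0:2] = 0

print(array_list[0])  # [0. 0. 1.]
print(array_list[2])  # every row is [0. 0. 1.]
```

The assignment happens in place, so the arrays stored in the list are modified directly.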
I have an array of 2d indices.
indices = [[2,4], [6,77], [102,554]]
Now, I have a different 4-dimensional array, arr, and for each pair in indices I want to extract the sub-array (it is an array, since arr is 4-dimensional) at that position in the first two dimensions. It is equivalent to the following code.
for i in range(len(indices)):
    output[i] = arr[indices[i][0], indices[i][1]]
However, I realized that using an explicit for-loop is slow. Is there any built-in numpy API that I can utilize? So far I have tried np.choose, np.put and np.take, but none of them gave me what I wanted. Thank you!
We need to index into the first two axes with the two columns from indices (thinking of it as an array).
Thus, simply convert to array and index, like so -
indices_arr = np.array(indices)
out = arr[indices_arr[:,0], indices_arr[:,1]]
Or we could extract those directly without converting to array and then index -
d0,d1 = [i[0] for i in indices], [i[1] for i in indices]
out = arr[d0,d1]
Another way to extract the elements would be with conversion to tuple, like so -
out = arr[tuple(indices_arr.T)]
If indices is already an array, skip the conversion process and use indices in places where we had indices_arr.
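To see that these variants agree with the loop from the question, here is a small sketch. The array shape and the index pairs are made-up values chosen to keep the example small.

```python
import numpy as np

arr = np.arange(3 * 4 * 2 * 2).reshape(3, 4, 2, 2)
indices = [[0, 1], [2, 3], [1, 0]]  # hypothetical (row, column) pairs

# Reference result: the explicit loop from the question.
output = np.empty((len(indices),) + arr.shape[2:], dtype=arr.dtype)
for i in range(len(indices)):
    output[i] = arr[indices[i][0], indices[i][1]]

indices_arr = np.array(indices)
out = arr[indices_arr[:, 0], indices_arr[:, 1]]  # one index array per axis
out_tuple = arr[tuple(indices_arr.T)]            # same thing via a tuple

print(out.shape)  # (3, 2, 2)
print(np.array_equal(out, output), np.array_equal(out_tuple, output))
```

Unlike basic slicing, indexing with integer arrays returns a copy rather than a view.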
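You can also try the take function of numpy. Note that np.take(arr, indices) on its own treats the values as positions in the flattened array rather than as (row, column) pairs, so convert the pairs to flat positions over the first two axes first. Your code should be something like:
flat = np.ravel_multi_index(tuple(np.array(indices).T), arr.shape[:2])
outputarray = np.take(arr.reshape((-1,) + arr.shape[2:]), flat, axis=0)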
I'm trying to get values from an ndarray with indices in another ndarray but I keep getting this error
IndexError: too many indices for array
The array that I'm trying to get the values from, scores, has scores.shape = (10,10000)
and the array pointing out the indices, indices, has indices.shape = (10000,2)
I'm trying to get the values this way:
values = scores[tuple(indices)]
but this is where I get the error.
What I'm trying to do this way is to access several individual values of scores, e.g. scores[0,6], scores[1,9] in another array so I get something like
[scores[0,6],scores[1,9],...]
all in one go and avoiding loops. Those [[0,6] , [1,9], ...] are stored in the indices array. I mention the previous in case that could lead to a work around.
Try the following: scores[indices[:,0],indices[:,1]]. Or alternatively, scores[tuple(indices.T)].
When you do scores[tuple(indices)], tuple(indices) creates a tuple of 10,000 two-element arrays, one per row of indices. NumPy interprets this as you trying to get 2 elements of a 10,000-dimensional array! For the sort of indexing you need, NumPy expects one array of values for each dimension. In other words, rather than ( [x1,y1], [x2,y2] ), it wants ( [x1,x2], [y1, y2] ).
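A small sketch of the difference, using a scores array of the shape given in the question but filled with made-up values, and only three index pairs instead of 10,000:

```python
import numpy as np

scores = np.arange(10 * 10000).reshape(10, 10000)
indices = np.array([[0, 6], [1, 9], [9, 9999]])  # shape (3, 2), hypothetical pairs

# Transposing gives one index array per dimension: ([0, 1, 9], [6, 9, 9999]).
values = scores[tuple(indices.T)]
print(values)  # [    6 10009 99999]

# Without the transpose there are three index arrays for a 2-D array.
try:
    scores[tuple(indices)]
    failed = False
except IndexError:
    failed = True
print(failed)  # True
```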