I have a numpy array of shape (294, 62, 350). Along the third dimension (the 350), I need to combine every two columns into one longer one which would result in an array of shape (294, 124, 175). For example if I have this array:
a_3d_array = np.array([[[1, 2, 3, 6, 1, 2], [3, 4, 3, 6, 1, 4]],
[[5, 2, 2, 1, 4, 2], [2, 9, 4, 3, 2, 7]]])
The expected output would be:
expected_output = np.array([[[5, 2, 4], [2, 4, 2], [ 2, 1, 2], [9, 3, 7]],
[[1, 3, 1], [3, 3, 1], [2, 6, 2], [4, 6, 4]]])
Sorry, I'm new to Python and have no idea how to approach this, so I don't have an attempt of my own to include here.
a_3d_array = np.array([[[1, 2, 3, 6, 1, 2], [3, 4, 3, 6, 1, 4]],
[[5, 2, 2, 1, 4, 2], [2, 9, 4, 3, 2, 7]]])
output = np.hstack([a_3d_array[:, :, ::2], a_3d_array[:, :, 1::2]])
To combine every N-th column:
N = 3
output = np.hstack([an_array[:, :, idx::N] for idx in range(N)])
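As a rough sketch of the generalisation above, the slicing can be wrapped in a small helper. The name combine_every_nth and the divisibility check are assumptions added for illustration, not part of the original answer; np.concatenate(..., axis=1) is used because that is what np.hstack does for arrays with two or more dimensions.

```python
import numpy as np

def combine_every_nth(arr, n):
    """Hypothetical helper: stack columns idx, idx+n, idx+2n, ... of the
    last axis on top of each other along axis 1, for idx = 0 .. n-1."""
    if arr.shape[2] % n != 0:
        raise ValueError("last axis length must be divisible by n")
    return np.concatenate([arr[:, :, idx::n] for idx in range(n)], axis=1)

a_3d_array = np.array([[[1, 2, 3, 6, 1, 2], [3, 4, 3, 6, 1, 4]],
                       [[5, 2, 2, 1, 4, 2], [2, 9, 4, 3, 2, 7]]])
small = combine_every_nth(a_3d_array, 2)
print(small.shape)  # (2, 4, 3)

# Shape check on an array the size of the one in the question.
big = np.zeros((294, 62, 350), dtype=np.int8)
print(combine_every_nth(big, 2).shape)  # (294, 124, 175)
```

Note that this keeps the first axis in its original order, whereas the expected output in the question lists the two blocks in reverse; append [::-1] if that ordering is wanted.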
You can reshape and reverse the first dimension:
a_3d_array.reshape((2,4,3), order='F')[::-1]
If you don't know the shape:
x,y,z = a_3d_array.shape
a_3d_array.reshape((x,y*2,-1), order='F')[::-1]
output:
array([[[5, 2, 4],
[2, 4, 2],
[2, 1, 2],
[9, 3, 7]],
[[1, 3, 1],
[3, 3, 1],
[2, 6, 2],
[4, 6, 4]]])
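The following sketch checks, on the example array only, that the Fortran-order reshape and the slice-and-stack approach produce the same values, and that reversing the first axis reproduces the expected output from the question. It assumes the last axis has an even length; the variable names are illustrative.

```python
import numpy as np

a_3d_array = np.array([[[1, 2, 3, 6, 1, 2], [3, 4, 3, 6, 1, 4]],
                       [[5, 2, 2, 1, 4, 2], [2, 9, 4, 3, 2, 7]]])
expected_output = np.array([[[5, 2, 4], [2, 4, 2], [2, 1, 2], [9, 3, 7]],
                            [[1, 3, 1], [3, 3, 1], [2, 6, 2], [4, 6, 4]]])

x, y, z = a_3d_array.shape

# Even-indexed columns stacked above odd-indexed columns, along axis 1.
stacked = np.hstack([a_3d_array[:, :, ::2], a_3d_array[:, :, 1::2]])

# Fortran-order reshape: the first axis varies fastest, so each group of
# y rows in the result comes from one column position of the original.
reshaped = a_3d_array.reshape((x, y * 2, -1), order='F')

print(np.array_equal(stacked, reshaped))                # True
print(np.array_equal(reshaped[::-1], expected_output))  # True
```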
Suppose I have a matrix A with some arbitrary values:
array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
And a matrix B which contains indices of elements in A:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
How do I select values from A pointed by B, i.e.:
A[B] = [[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]]
EDIT: np.take_along_axis is a built-in function for this use case, available since NumPy 1.15. See @hpaulj's answer below for how to use it.
You can use NumPy's advanced indexing -
A[np.arange(A.shape[0])[:,None],B]
One can also use linear indexing -
m,n = A.shape
out = np.take(A,B + n*np.arange(m)[:,None])
Sample run -
In [40]: A
Out[40]:
array([[2, 4, 5, 3],
[1, 6, 8, 9],
[8, 7, 0, 2]])
In [41]: B
Out[41]:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
In [42]: A[np.arange(A.shape[0])[:,None],B]
Out[42]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
In [43]: m,n = A.shape
In [44]: np.take(A,B + n*np.arange(m)[:,None])
Out[44]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
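To make the linear-indexing variant a little more concrete, here is a small sketch of what the offsets look like. It relies on np.take (without an axis argument) reading A in flattened row-major order, so element (i, j) sits at position i*n + j. The intermediate names row_offsets and flat_idx are illustrative only.

```python
import numpy as np

A = np.array([[2, 4, 5, 3],
              [1, 6, 8, 9],
              [8, 7, 0, 2]])
B = np.array([[0, 0, 1, 2],
              [0, 3, 2, 1],
              [3, 2, 1, 0]])

m, n = A.shape

# Row i of the flattened array starts at position i * n.
row_offsets = n * np.arange(m)[:, None]   # [[0], [4], [8]]
flat_idx = B + row_offsets                # column index + row start
print(flat_idx)
# [[ 0  0  1  2]
#  [ 4  7  6  5]
#  [11 10  9  8]]

via_take = np.take(A, flat_idx)           # indexes the flattened array
via_advanced = A[np.arange(m)[:, None], B]
print(np.array_equal(via_take, via_advanced))  # True
```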
More recent versions have added a take_along_axis function that does the job:
A = np.array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
B = np.array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
np.take_along_axis(A, B, 1)
Out[]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
There's also a put_along_axis.
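As a brief illustration of put_along_axis, the sketch below zeroes out the largest value in each row. The task itself is made up for the example; note that np.put_along_axis modifies its first argument in place and returns None, so a copy is used here to keep A intact.

```python
import numpy as np

A = np.array([[2, 4, 5, 3],
              [1, 6, 8, 9],
              [8, 7, 0, 2]])

# Index of the largest value in each row, kept 2-D so that it has the
# same number of dimensions as A (required by the *_along_axis functions).
idx = np.argmax(A, axis=1)[:, None]       # [[2], [3], [0]]

row_max = np.take_along_axis(A, idx, axis=1)
print(row_max.ravel())                    # [5 9 8]

zeroed = A.copy()
np.put_along_axis(zeroed, idx, 0, axis=1) # in place, returns None
print(zeroed)
# [[2 4 0 3]
#  [1 6 8 0]
#  [0 7 0 2]]
```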
I know this is an old question, but another way of doing it using indices is:
A[np.indices(B.shape)[0], B]
output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]
The following solution uses a for loop:
outlist = []
for i in range(len(B)):
    lst = []
    for j in range(len(B[i])):
        lst.append(A[i][B[i][j]])
    outlist.append(lst)
outarray = np.asarray(outlist)
print(outarray)
The above can also be written more succinctly as a list comprehension:
outlist = [ [A[i][B[i][j]] for j in range(len(B[i]))]
for i in range(len(B)) ]
outarray = np.asarray(outlist)
print(outarray)
Output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]
I want to use this code on a very large array, but it takes a long time to execute and is not efficient.
Is there any way to remove the loops and make this code faster?
>>> import numpy as np
>>> x=np.random.randint(10, size=(4,5,3))
>>> x
array([[[3, 2, 6],
[4, 6, 6],
[3, 7, 9],
[6, 4, 2],
[9, 0, 1]],
[[9, 0, 4],
[1, 8, 9],
[6, 8, 1],
[9, 4, 5],
[1, 5, 2]],
[[6, 1, 6],
[1, 8, 8],
[3, 8, 3],
[7, 1, 0],
[7, 7, 0]],
[[5, 6, 6],
[8, 3, 1],
[0, 5, 4],
[6, 1, 2],
[5, 6, 1]]])
>>> y=[]
>>> for i in range(x.shape[1]):
...     for j in range(x.shape[2]):
...         y.append(x[:, i, j].tolist())
>>> y
[[3, 9, 6, 5], [2, 0, 1, 6], [6, 4, 6, 6], [4, 1, 1, 8], [6, 8, 8, 3], [6, 9, 8, 1], [3, 6, 3, 0], [7, 8, 8, 5], [9, 1, 3, 4], [6, 9, 7, 6], [4, 4, 1, 1], [2, 5, 0, 2], [9, 1, 7, 5], [0, 5, 7, 6], [1, 2, 0, 1]]
You could permute axes with np.transpose and then reshape to 2D -
y = x.transpose(1,2,0).reshape(-1,x.shape[0])
Append with .tolist() for list output.
Yes, either use np.reshape(x, shape) or try np.ndarray.flatten(x, order='F') (F for Fortran style, column first, according to your example).
Read the documentation to find out which parameters fit best. IMHO, ndarray.flatten is the better and more elegant option here. However, depending on the exact result you want, you might have to reshape the array first.
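To compare the suggestions against the loop from the question, here is a small sketch on a random array of the same shape. The seed and variable names are arbitrary choices for the example. The flatten variant shown is one possible way to make order='F' produce the wanted ordering: it needs an axis permutation first, since flattening x directly in Fortran order iterates the axes in a different sequence than the loop does.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(10, size=(4, 5, 3))

# Reference result: the double loop from the question.
y_loop = [x[:, i, j].tolist()
          for i in range(x.shape[1])
          for j in range(x.shape[2])]

# Move axis 0 to the end, then collapse the first two axes into rows.
y_transpose = x.transpose(1, 2, 0).reshape(-1, x.shape[0])

# Fortran-order flatten after swapping the last two axes, then regroup
# every x.shape[0] consecutive values into one row.
y_flatten = (x.transpose(0, 2, 1)
              .flatten(order='F')
              .reshape(-1, x.shape[0]))

print(y_transpose.tolist() == y_loop)  # True
print(y_flatten.tolist() == y_loop)    # True
```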
I have a 2D Numpy ndarray, x, that I need to split in square subregions of size s. For each subregion, I want to get the greatest element (which I do), and its position within that subregion (which I can't figure out).
Here is a minimal example:
>>> x = np.random.randint(0, 10, (6,8))
>>> x
array([[9, 4, 8, 9, 5, 7, 3, 3],
[3, 1, 8, 0, 7, 7, 5, 1],
[7, 7, 3, 6, 0, 2, 1, 0],
[7, 3, 9, 8, 1, 6, 7, 7],
[1, 6, 0, 7, 5, 1, 2, 0],
[8, 7, 9, 5, 8, 3, 6, 0]])
>>> h, w = x.shape
>>> s = 2
>>> f = x.reshape(h//s, s, w//s, s)
>>> mx = np.max(f, axis=(1, 3))
>>> mx
array([[9, 9, 7, 5],
[7, 9, 6, 7],
[8, 9, 8, 6]])
For example, the 8 in the lower left corner of mx is the greatest element from subregion [[1,6], [8, 7]] in the lower left corner of x.
What I want is to get an array similar to mx, that keeps the indices of the largest elements, like this:
[[0, 1, 1, 2],
[0, 2, 3, 2],
[2, 2, 2, 2]]
where, for example, the 2 in the lower left corner is the index of 8 in the linear representation of [[1, 6], [8, 7]].
I could do it like this: np.argmax(f[i, :, j, :]) and iterate over i and j, but the speed difference is enormous for large amounts of computation. To give you an idea, I'm trying to use (only) Numpy for max pooling. Basically, I'm asking if there is a faster alternative than what I'm using.
Here's one approach -
# Get shape of output array
m,n = np.array(x.shape)//s
# Reshape and permute axes to bring the block as rows
x1 = x.reshape(h//s, s, w//s, s).swapaxes(1,2).reshape(-1,s**2)
# Use argmax along each row and reshape to output shape
out = x1.argmax(1).reshape(m,n)
Sample input, output -
In [362]: x
Out[362]:
array([[9, 4, 8, 9, 5, 7, 3, 3],
[3, 1, 8, 0, 7, 7, 5, 1],
[7, 7, 3, 6, 0, 2, 1, 0],
[7, 3, 9, 8, 1, 6, 7, 7],
[1, 6, 0, 7, 5, 1, 2, 0],
[8, 7, 9, 5, 8, 3, 6, 0]])
In [363]: out
Out[363]:
array([[0, 1, 1, 2],
[0, 2, 3, 2],
[2, 2, 2, 2]])
Alternatively, to simplify things, we could use scikit-image that does the heavy work of reshaping and permuting axes for us -
In [372]: from skimage.util import view_as_blocks as viewB
In [373]: viewB(x, (s,s)).reshape(-1,s**2).argmax(1).reshape(m,n)
Out[373]:
array([[0, 1, 1, 2],
[0, 2, 3, 2],
[2, 2, 2, 2]])
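For max pooling it is often useful to map the per-block linear index back to a position in x. The sketch below is one way to do that under the same assumptions as above (both dimensions of x divisible by s); the names blocks, flat_idx, rows and cols are illustrative and not from the original answer.

```python
import numpy as np

x = np.array([[9, 4, 8, 9, 5, 7, 3, 3],
              [3, 1, 8, 0, 7, 7, 5, 1],
              [7, 7, 3, 6, 0, 2, 1, 0],
              [7, 3, 9, 8, 1, 6, 7, 7],
              [1, 6, 0, 7, 5, 1, 2, 0],
              [8, 7, 9, 5, 8, 3, 6, 0]])
s = 2
h, w = x.shape
m, n = h // s, w // s

# One row of s*s values per block, arranged on an (m, n) grid of blocks.
blocks = x.reshape(m, s, n, s).swapaxes(1, 2).reshape(m, n, s * s)

flat_idx = blocks.argmax(axis=2)   # linear index inside each block
mx = np.take_along_axis(blocks, flat_idx[..., None], axis=2)[..., 0]

# Convert the linear index to (row, col) inside the block, then to
# absolute coordinates in x.
row_in_block, col_in_block = np.divmod(flat_idx, s)
rows = np.arange(m)[:, None] * s + row_in_block
cols = np.arange(n)[None, :] * s + col_in_block

print(flat_idx)
# [[0 1 1 2]
#  [0 2 3 2]
#  [2 2 2 2]]
print(np.array_equal(x[rows, cols], mx))  # True
```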
I have a numpy array say
a = array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
I have an array 'replication' of the same size, where replication[i,j] (>= 0) denotes how many times a[i][j] should be repeated along the row. Obviously, the replication array follows the invariant that np.sum(replication[i]) has the same value for all i.
For example, if
replication = array([[1, 2, 1],
[1, 1, 2],
[2, 1, 1]])
then the final array after replicating is:
new_a = array([[1, 2, 2, 3],
[4, 5, 6, 6],
[7, 7, 8, 9]])
Presently, I am doing this to create new_a:
# allocate new_a
h = a.shape[0]
w = a.shape[1]
for row in range(h):
    ll = [[a[row][j]]*replication[row][j] for j in range(w)]
    new_a[row] = np.array([item for sublist in ll for item in sublist])
However, this seems to be too slow, as it involves Python lists. Can I do this entirely in NumPy, without using Python lists?
You can flatten out your replication array, then use the .repeat() method of a:
import numpy as np
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
replication = np.array([[1, 2, 1],
                        [1, 1, 2],
                        [2, 1, 1]])
new_a = a.repeat(replication.ravel()).reshape(a.shape[0], -1)
print(repr(new_a))
# array([[1, 2, 2, 3],
# [4, 5, 6, 6],
# [7, 7, 8, 9]])
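The final reshape only works because every row of replication sums to the same total, as stated in the question. A minimal sketch that makes this assumption explicit is shown below; the function name replicate_rows and the error check are additions for illustration, not part of the original answer.

```python
import numpy as np

def replicate_rows(a, replication):
    """Hypothetical helper: repeat a[i, j] replication[i, j] times along
    its row. All rows of `replication` must have the same sum."""
    row_totals = replication.sum(axis=1)
    if not (row_totals == row_totals[0]).all():
        raise ValueError("every row of replication must sum to the same value")
    # np.repeat without an axis works on the flattened array, so the
    # per-element counts are passed flattened too, then rows are restored.
    return np.repeat(a.ravel(), replication.ravel()).reshape(a.shape[0], -1)

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
replication = np.array([[1, 2, 1],
                        [1, 1, 2],
                        [2, 1, 1]])

new_a = replicate_rows(a, replication)
print(new_a)
# [[1 2 2 3]
#  [4 5 6 6]
#  [7 7 8 9]]
```

A count of zero simply drops that element from its row, which is consistent with the >= 0 condition in the question.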