I have an array
xx = np.arange(24).reshape(2, 12)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])
and I would like to reshape it, to obtain
array([[[ 0, 1, 2, 3],
[12, 13, 14, 15]],
[[ 4, 5, 6, 7],
[16, 17, 18, 19]],
[[ 8, 9, 10, 11],
[20, 21, 22, 23]]])
I can achieve it via
xx.T.reshape(3, 4, 2).transpose(0, 2, 1)
But it has to be transposed twice, which seems unnecessary to me. Could somebody confirm that this is the only way of doing it, or otherwise provide a more readable solution?
Thanks!
It is possible to do a single transpose:
data = np.arange(24).reshape(2, 12)
data = data.reshape(2, 3, 4).transpose(1, 0, 2)
Edit:
I checked this using itertools.permutations and itertools.product:
import itertools
import numpy as np
data = np.arange(24).reshape(2, 12)
desired_data = np.array([[[ 0, 1, 2, 3],
[12, 13, 14, 15]],
[[ 4, 5, 6, 7],
[16, 17, 18, 19]],
[[ 8, 9, 10, 11],
[20, 21, 22, 23]]])
shapes = [2, 3, 4]
transpose_dims = [0, 1, 2]
shape_permutations = itertools.permutations(shapes)
transpose_permutations = itertools.permutations(transpose_dims)
for shape, transpose in itertools.product(
    list(shape_permutations),
    list(transpose_permutations),
):
    new_data = data.reshape(*shape).transpose(*transpose)
    # array_equal returns False on a shape mismatch instead of raising,
    # so the loop stops only at a combination that really matches.
    if np.array_equal(new_data, desired_data):
        break
print(f"{shape=}, {transpose=}")
shape=(2, 3, 4), transpose=(1, 0, 2)
I would do it this way: first, generate two arrays (shown separated for the sake of decomposition):
xx.reshape(2, -1, 4)
# Output:
# array([[[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11]],
#
# [[12, 13, 14, 15],
# [16, 17, 18, 19],
# [20, 21, 22, 23]]])
From here, I would then stack along the second dimension in order to combine them like you want:
np.stack(xx.reshape(2, -1, 4), axis=1)
# Output:
# array([[[ 0, 1, 2, 3],
# [12, 13, 14, 15]],
#
# [[ 4, 5, 6, 7],
# [16, 17, 18, 19]],
#
# [[ 8, 9, 10, 11],
# [20, 21, 22, 23]]])
You'd avoid the transposition. Hopefully it's more readable, but in the end, that's highly subjective, right? '^^
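One thing that may be worth checking when choosing between the two routes is whether the result is a view or a copy. The following is only a small sketch (the variable names are mine, not from the answers above) that compares the stack-based result with the single-transpose result using np.shares_memory:

```python
import numpy as np

xx = np.arange(24).reshape(2, 12)

# Single-transpose route: reshape and transpose both return views here.
via_transpose = xx.reshape(2, 3, 4).transpose(1, 0, 2)

# Stack route: np.stack builds a new array from the two (3, 4) blocks.
via_stack = np.stack(xx.reshape(2, -1, 4), axis=1)

same_values = np.array_equal(via_transpose, via_stack)
transpose_is_view = np.shares_memory(xx, via_transpose)
stack_is_view = np.shares_memory(xx, via_stack)
print(same_values, transpose_is_view, stack_is_view)  # True True False
```

So both give the same values, but the stack version allocates a new array, which may matter for large inputs.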
To add to @Paul's answer: there is some speedup from removing one of the transposes. The time gain is about 15%.
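For anyone who wants to measure this themselves, here is a rough sketch using timeit. The function names and the repeat count are my own choices, and the numbers will vary with machine and NumPy version, so no particular result is claimed:

```python
import timeit
import numpy as np

xx = np.arange(24).reshape(2, 12)

def two_transposes():
    # Route from the question: transpose, reshape, transpose again.
    return xx.T.reshape(3, 4, 2).transpose(0, 2, 1)

def one_transpose():
    # Route from the answer: reshape, then a single transpose.
    return xx.reshape(2, 3, 4).transpose(1, 0, 2)

# Both routes must agree before timing them is meaningful.
routes_agree = np.array_equal(two_transposes(), one_transpose())

t_two = timeit.timeit(two_transposes, number=20_000)
t_one = timeit.timeit(one_transpose, number=20_000)
print(f"two transposes: {t_two:.4f} s, one transpose: {t_one:.4f} s")
```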
I would like to combine the first and the last dimension of a 3-D NumPy array into one dimension, without copying the data:
import numpy as np
data = np.empty((3, 4, 5))
data = data.transpose([0, 2, 1])
try:
    # this fails, indicating that it is not possible:
    # AttributeError: incompatible shape for a non-contiguous array
    data.shape = (-1, 4)
except AttributeError:
    # this creates a copy of the data:
    data = data.reshape((-1, 4))
Is this possible?
In [55]: arr = np.arange(24).reshape(2,3,4)
In [56]: arr1 = arr.transpose(2,1,0)
In [57]: arr
Out[57]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [58]: arr1
Out[58]:
array([[[ 0, 12],
[ 4, 16],
[ 8, 20]],
[[ 1, 13],
[ 5, 17],
[ 9, 21]],
[[ 2, 14],
[ 6, 18],
[10, 22]],
[[ 3, 15],
[ 7, 19],
[11, 23]]])
Look at how the values are laid out in the 1d data buffer:
In [59]: arr.ravel()
Out[59]:
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23])
Compare the order after the transpose:
In [60]: arr1.ravel()
Out[60]:
array([ 0, 12, 4, 16, 8, 20, 1, 13, 5, 17, 9, 21, 2, 14, 6, 18, 10,
22, 3, 15, 7, 19, 11, 23])
If the raveled values don't have the same order, you can't avoid a copy.
reshape has this note:
You can think of reshaping as first raveling the array (using the given
index order), then inserting the elements from the raveled array into the
new array using the same kind of index ordering as was used for the
raveling.
In [63]: arr1.reshape(-1,2)
Out[63]:
array([[ 0, 12],
[ 4, 16],
[ 8, 20],
[ 1, 13],
[ 5, 17],
[ 9, 21],
[ 2, 14],
[ 6, 18],
[10, 22],
[ 3, 15],
[ 7, 19],
[11, 23]])
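If you want to confirm the copy rather than infer it from the element order, np.shares_memory can be used. A minimal sketch with the same arr and arr1 as above (arr2 is just a name I picked for the reshaped result):

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)
arr1 = arr.transpose(2, 1, 0)   # a view: same buffer, different strides
arr2 = arr1.reshape(-1, 2)      # element order differs, so NumPy has to copy

transpose_shares = np.shares_memory(arr, arr1)
reshape_shares = np.shares_memory(arr, arr2)
print(transpose_shares, reshape_shares)  # True False
```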
I have a huge (N*20) matrix in which every 5 rows form a valid sample, i.e. every (5*20) block. I'm trying to reshape it into a (N/5,1,20,5) matrix where the dimension 20 is kept unchanged. I could do it in TensorFlow using keep_dim, but how can I achieve this in NumPy?
Thanks in advance.
Reshape and then swap the axes around:
arr1 = arr.reshape(N//5,5,1,20)  # integer division: reshape needs int dimensions
arr2 = arr1.transpose(0,2,3,1)
For example:
In [476]: arr = np.arange(24).reshape(6,4)
In [477]: arr
Out[477]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
In [478]: arr1 = arr.reshape(2,3,1,4)
In [479]: arr2 = arr1.transpose(0,2,3,1)
In [480]: arr2.shape
Out[480]: (2, 1, 4, 3)
In [482]: arr2
Out[482]:
array([[[[ 0, 4, 8],
[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11]]],
[[[12, 16, 20],
[13, 17, 21],
[14, 18, 22],
[15, 19, 23]]]])
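The same steps with the 20-column layout from the question can be sketched as follows. N = 10 is a small stand-in size, and the names width and rows_per_sample are mine, used only for illustration:

```python
import numpy as np

N, width, rows_per_sample = 10, 20, 5   # small stand-in sizes for illustration
arr = np.arange(N * width).reshape(N, width)

arr1 = arr.reshape(N // rows_per_sample, rows_per_sample, 1, width)
arr2 = arr1.transpose(0, 2, 3, 1)
print(arr2.shape)  # (2, 1, 20, 5)

# Column j of sample i is original row i * rows_per_sample + j.
i, j = 1, 3
row_matches = np.array_equal(arr2[i, 0, :, j], arr[i * rows_per_sample + j])
print(row_matches)  # True
```

Note that arr2 is a transposed view, so a later reshape of it may still trigger a copy.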
I want to shuffle the ordering of only some rows in a numpy array. These rows will always be contiguous (e.g. shuffling rows 23-80). The number of elements in each row can vary from 1 (such that the array is actually 1D) to 100.
Below is example code to demonstrate how I see the method shuffle_rows() could work. How would I design such a method to do this shuffling efficiently?
import numpy as np
>>> a = np.arange(20).reshape(4, 5)
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
>>> shuffle_rows(a, [1, 3]) # including rows 1, 2 and 3 in the shuffling
array([[ 0, 1, 2, 3, 4],
[15, 16, 17, 18, 19],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
You can use np.random.shuffle. This shuffles the rows themselves, not the elements within the rows.
From the docs:
This function only shuffles the array along the first index of a multi-dimensional array
As an example:
import numpy as np
def shuffle_rows(arr,rows):
    np.random.shuffle(arr[rows[0]:rows[1]+1])
a = np.arange(20).reshape(4, 5)
print(a)
# array([[ 0, 1, 2, 3, 4],
# [ 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19]])
shuffle_rows(a,[1,3])
print(a)
#array([[ 0, 1, 2, 3, 4],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19],
# [ 5, 6, 7, 8, 9]])
shuffle_rows(a,[1,3])
print(a)
#array([[ 0, 1, 2, 3, 4],
# [10, 11, 12, 13, 14],
# [ 5, 6, 7, 8, 9],
# [15, 16, 17, 18, 19]])
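If you need repeatable shuffles, the same idea should also work with a seeded Generator from np.random.default_rng. This is only a sketch: shuffle_rows_rng and the seed value are my own choices, not part of the answer above.

```python
import numpy as np

def shuffle_rows_rng(arr, rows, rng):
    """Shuffle rows rows[0]..rows[1] (inclusive) of arr in place."""
    start, stop = rows
    # Basic slicing returns a view, so shuffling it reorders arr itself.
    rng.shuffle(arr[start:stop + 1])

rng = np.random.default_rng(0)   # seed chosen only for repeatability
a = np.arange(20).reshape(4, 5)
original = a.copy()
shuffle_rows_rng(a, [1, 3], rng)
print(a)
```

Row 0 stays where it is, and rows 1 to 3 come out in some permutation of their original order.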
I have a NumPy array of size n x m. I want to limit the number of columns to k and move the remaining columns into new rows. The scenario is as follows:
Initial array: nxm
Final array: pxk
where p = (m/k)*n
E.g. n = 2, m = 6, k = 2
Initial array:
[[1, 2, 3, 4, 5, 6,],
[7, 8, 9, 10, 11, 12]]
Final array:
[[1, 2],
[7, 8],
[3, 4],
[9, 10],
[5, 6],
[11, 12]]
I tried using reshape but did not get the desired result.
Here's one way to do it
import numpy as np
q=np.array([[1, 2, 3, 4, 5, 6,],
            [7, 8, 9, 10, 11, 12]])
r=q.T.reshape(-1,2,2)
s=r.swapaxes(1,2)
t=s.reshape(-1,2)
as a one liner,
q.T.reshape(-1,2,2).swapaxes(1,2).reshape(-1,2)
array([[ 1, 2],
[ 7, 8],
[ 3, 4],
[ 9, 10],
[ 5, 6],
[11, 12]])
EDIT: for the general case, use
q=np.arange(1,1+n*m).reshape(n,m) # example input
r=q.T.reshape(-1,k,n)
s=r.swapaxes(1,2)
t=s.reshape(-1,k)
one liner is:
q.T.reshape(-1,k,n).swapaxes(1,2).reshape(-1,k)
example for n=3,m=12,k=4
q=np.array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
            [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24],
            [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36]])
result is
array([[ 1, 2, 3, 4],
[13, 14, 15, 16],
[25, 26, 27, 28],
[ 5, 6, 7, 8],
[17, 18, 19, 20],
[29, 30, 31, 32],
[ 9, 10, 11, 12],
[21, 22, 23, 24],
[33, 34, 35, 36]])
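A quick way to sanity-check the general one-liner is to compare one output row against the corresponding slice of the input. A small sketch for the n=3, m=12, k=4 case (the indices i and j are arbitrary picks of mine):

```python
import numpy as np

n, m, k = 3, 12, 4
q = np.arange(1, 1 + n * m).reshape(n, m)

t = q.T.reshape(-1, k, n).swapaxes(1, 2).reshape(-1, k)
print(t.shape)  # (9, 4)

# Row i * n + j of the result is columns i*k .. i*k + k - 1 of input row j.
i, j = 2, 1
block_matches = np.array_equal(t[i * n + j], q[j, i * k:(i + 1) * k])
print(block_matches)  # True
```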
Using numpy.vstack and numpy.hsplit:
a = np.array([[1, 2, 3, 4, 5, 6,],
[7, 8, 9, 10, 11, 12]])
n, m, k = 2, 6, 2
np.vstack(np.hsplit(a, m//k))  # integer division keeps the section count an int
result array:
array([[ 1, 2],
[ 7, 8],
[ 3, 4],
[ 9, 10],
[ 5, 6],
[11, 12]])
UPDATE: As flebool commented, the above code is very slow, because hsplit returns a Python list, and vstack then reconstructs the final array from that list of arrays.
Here's an alternative solution that is much faster.
a.reshape(-1, m//k, k).transpose(1, 0, 2).reshape(-1, k)
or
a.reshape(-1, m//k, k).swapaxes(0, 1).reshape(-1, k)
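For reuse, the reshape route could be wrapped in a small function that also checks that k divides the number of columns. This is a sketch only: wrap_columns is a hypothetical name, and the divisibility check is my addition.

```python
import numpy as np

def wrap_columns(a, k):
    """Hypothetical helper: stack k-column blocks of a on top of each other."""
    n, m = a.shape
    if m % k:
        raise ValueError(f"number of columns {m} is not a multiple of k={k}")
    # (n, m) -> (n, m//k, k) -> (m//k, n, k) -> (m//k * n, k)
    return a.reshape(n, m // k, k).swapaxes(0, 1).reshape(-1, k)

a = np.array([[1, 2, 3, 4, 5, 6],
              [7, 8, 9, 10, 11, 12]])
result = wrap_columns(a, 2)
reference = np.vstack(np.hsplit(a, 3))    # the slower hsplit/vstack route
print(np.array_equal(result, reference))  # True
```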