numpy is double transposition necessary in this specific case? - python

I have an array
xx = np.arange(24).reshape(2, 12)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])
and I would like to reshape it, to obtain
array([[[ 0, 1, 2, 3],
[12, 13, 14, 15]],
[[ 4, 5, 6, 7],
[16, 17, 18, 19]],
[[ 8, 9, 10, 11],
[20, 21, 22, 23]]])
I can achieve it via
xx.T.reshape(3, 4, 2).transpose(0, 2, 1)
But it has to be transposed twice, which seems unnecessary to me. So could somebody confirm that this is the only way of doing it or provide more readable solution otherwise?
Thanks!

It is possible to do a single transpose:
data = np.arange(24).reshape(2, 12)
data = data.reshape(2, 3, 4).transpose(1, 0, 2)
Edit:
I checked this using itertools.permutations and itertools.product:
import itertools
import numpy as np
data = np.arange(24).reshape(2, 12)
desired_data = np.array([[[ 0, 1, 2, 3],
[12, 13, 14, 15]],
[[ 4, 5, 6, 7],
[16, 17, 18, 19]],
[[ 8, 9, 10, 11],
[20, 21, 22, 23]]])
shapes = [2, 3, 4]
transpose_dims = [0, 1, 2]
shape_permutations = itertools.permutations(shapes)
transpose_permutations = itertools.permutations(transpose_dims)
for shape, transpose in itertools.product(
list(shape_permutations),
list(transpose_permutations),
):
new_data = data.reshape(*shape).transpose(*transpose)
try:
np.allclose(new_data, desired_data)
except ValueError as e:
pass
else:
break
print(f"{shape=}, {transpose=}")
shape=(2, 3, 4), transpose=(1, 0, 2)

I would do it this way: first, generate two arrays (shown separated for the sake of decomposition):
xx.reshape(2, -1, 4)
# Output:
# array([[[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11]],
#
# [[12, 13, 14, 15],
# [16, 17, 18, 19],
# [20, 21, 22, 23]]])
From here, I would then stack along the second dimension in order to combine them like you want:
np.stack(xx.reshape(2, -1, 4), axis=1)
# Output:
# array([[[ 0, 1, 2, 3],
# [12, 13, 14, 15]],
#
# [[ 4, 5, 6, 7],
# [16, 17, 18, 19]],
#
# [[ 8, 9, 10, 11],
# [20, 21, 22, 23]]])
You'd avoid the transposition. Hopefully it's more readable, but in the end, that's highly subjective, right? '^^

To add on top of #Paul's answer, there is some speedup from removing one of the transpose. The time gain is of ~15%:

Related

How to transpose first 2 dimensions in 3D array

How do I transpose the first 2 dimensions of a 3D array 'matrix?
matrix = np.random.rand(2,3,4)
In the third dimensions I want to swap 'rows' with 'columns', preferably without a loop.
You can use the .transpose() function.
matrix = matrix.transpose(1, 0, 2)
means swap the first and the second axis.
You can use swapaxes:
matrix2 = matrix.swapaxes(0,1)
example:
# input
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
# output
array([[[ 0, 1, 2, 3],
[12, 13, 14, 15]],
[[ 4, 5, 6, 7],
[16, 17, 18, 19]],
[[ 8, 9, 10, 11],
[20, 21, 22, 23]]])

Can I combine non-adjacent dimensions in a NumPy array without copying data?

I would like to combine the first and the last dimension of a 3-D NumPy array into one dimension, without copying the data:
import numpy as np
data = np.empty((3, 4, 5))
data = data.transpose([0, 2, 1])
try:
# this fails, indicating that it is not possible:
# AttributeError: incompatible shape for a non-contiguous array
data.shape = (-1, 4)
except AttributeError:
# this creates a copy of the data:
data = data.reshape((-1, 4))
Is this possible?
In [55]: arr = np.arange(24).reshape(2,3,4)
In [56]: arr1 = arr.transpose(2,1,0)
In [57]: arr
Out[57]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [58]: arr1
Out[58]:
array([[[ 0, 12],
[ 4, 16],
[ 8, 20]],
[[ 1, 13],
[ 5, 17],
[ 9, 21]],
[[ 2, 14],
[ 6, 18],
[10, 22]],
[[ 3, 15],
[ 7, 19],
[11, 23]]])
Look at how the values are laid out in the 1d data buffer:
In [59]: arr.ravel()
Out[59]:
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23])
compare the order after the transpose:
In [60]: arr1.ravel()
Out[60]:
array([ 0, 12, 4, 16, 8, 20, 1, 13, 5, 17, 9, 21, 2, 14, 6, 18, 10,
22, 3, 15, 7, 19, 11, 23])
If the raveled values don't have the same order, you can't avoid a copy.
reshape has this note:
You can think of reshaping as first raveling the array (using the given
index order), then inserting the elements from the raveled array into the
new array using the same kind of index ordering as was used for the
raveling.
In [63]: arr1.reshape(-1,2)
Out[63]:
array([[ 0, 12],
[ 4, 16],
[ 8, 20],
[ 1, 13],
[ 5, 17],
[ 9, 21],
[ 2, 14],
[ 6, 18],
[10, 22],
[ 3, 15],
[ 7, 19],
[11, 23]])

Multi Dimensional Numpy Array Sorting

I have (5,5) np.array like below.
>>> a
array([[23, 15, 11, 0, 17],
[ 1, 2, 20, 4, 6],
[16, 22, 8, 10, 18],
[ 7, 12, 13, 14, 5],
[ 3, 9, 21, 19, 24]])
I want to multi dimensional sort the np.array to look like below.
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
To do that I did,
flatten() the array.
Sort the flatted array.
Reshape to (5,5)
In my method I feel like it is a bad programming practice.Are there any sophisticated way to do that task?
Thank you.
>>> a array([[23, 15, 11, 0, 17],
[ 1, 2, 20, 4, 6],
[16, 22, 8, 10, 18],
[ 7, 12, 13, 14, 5],
[ 3, 9, 21, 19, 24]])
>>> a_flat = a.flatten()
>>> a_flat
array([23, 15, 11, 0, 17, 1, 2, 20, 4, 6, 16, 22, 8, 10, 18, 7, 12,
13, 14, 5, 3, 9, 21, 19, 24])
>>> a_sort = np.sort(a_flat)
>>> a_sorted = a_sort.reshape(5,5)
>>> a_sorted
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
We could get a flattened view with np.ravel() and then sort in-place with ndarray.sort() -
a.ravel().sort()
Being everything in-place, it avoids creating any temporary array and also maintains the shape, which avoids any need of reshape.
Sample run -
In [18]: a
Out[18]:
array([[23, 15, 11, 0, 17],
[ 1, 2, 20, 4, 6],
[16, 22, 8, 10, 18],
[ 7, 12, 13, 14, 5],
[ 3, 9, 21, 19, 24]])
In [19]: a.ravel().sort()
In [20]: a
Out[20]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])

Preserving sequential order of numpy 2D arrays

Given this 2D numpy array:
a=numpy.array([[31,22,43],[44,55,6],[17,68,19],[12,11,18],...,[99,98,97]])
given the need to flatten it using numpy.ravel:
b=numpy.ravel(a)
and given the need to later dump it into a pandas dataframe, how can I make sure the sequential order of the values in a is preserved when applying numpy.ravel? e.g., How can I check/ensure that numpy.ravel does not mess up with the original sequential order?
Of course the intended result should be that the numbers coming before and after 17 in b, for instance, are the same as in a.
First of all you need to formulate what "sequential" order means for you, as numpy.ravel() does preserve order. Here is a tip how to formulate what you need: try with a simplest possible toy example:
import numpy as np
X = np.arange(20).reshape(-1,4)
X
#array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11],
# [12, 13, 14, 15],
# [16, 17, 18, 19]])
X.ravel()
# array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
# 13, 14, 15, 16, 17, 18, 19])
Does it meet your expectation? Or you want to see this order:
Z = X.T
Z
# array([[ 0, 4, 8, 12, 16],
# [ 1, 5, 9, 13, 17],
# [ 2, 6, 10, 14, 18],
# [ 3, 7, 11, 15, 19]])
Z.ravel()
# array([ 0, 4, 8, 12, 16, 1, 5, 9, 13, 17, 2, 6, 10,
# 14, 18, 3, 7, 11, 15, 19])

Shuffle ordering of some rows in numpy array

I want to shuffle the ordering of only some rows in a numpy array. These rows will always be continuous (e.g. shuffling rows 23-80). The number of elements in each row can vary from 1 (such that the array is actually 1D) to 100.
Below is example code to demonstrate how I see the method shuffle_rows() could work. How would I design such a method to do this shuffling efficiently?
import numpy as np
>>> a = np.arange(20).reshape(4, 5)
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
>>> shuffle_rows(a, [1, 3]) # including rows 1, 2 and 3 in the shuffling
array([[ 0, 1, 2, 3, 4],
[15, 16, 17, 18, 19],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
You can use np.random.shuffle. This shuffles the rows themselves, not the elements within the rows.
From the docs:
This function only shuffles the array along the first index of a multi-dimensional array
As an example:
import numpy as np
def shuffle_rows(arr,rows):
np.random.shuffle(arr[rows[0]:rows[1]+1])
a = np.arange(20).reshape(4, 5)
print(a)
# array([[ 0, 1, 2, 3, 4],
# [ 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19]])
shuffle_rows(a,[1,3])
print(a)
#array([[ 0, 1, 2, 3, 4],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19],
# [ 5, 6, 7, 8, 9]])
shuffle_rows(a,[1,3])
print(a)
#array([[ 0, 1, 2, 3, 4],
# [10, 11, 12, 13, 14],
# [ 5, 6, 7, 8, 9],
# [15, 16, 17, 18, 19]])

Categories

Resources