I am building a neural network and need to flatten my training dataset.
I have two options.
The first is:
train_x_flatten = train_x_orig.reshape(train_x_orig.shape[0], -1).T
and the second is:
train_x_flatten = train_x_orig.reshape(train_x_orig.shape[1]*train_x_orig.shape[2]*train_x_orig.shape[3], 209)
Both give the same shape, but I get a different result when computing the cost.
Why is that? Thank you.
Based on the second example, your original tensor has at least rank 4. The first example reads the elements in order of increasing right-most index and fills train_x_orig.shape[0] rows, one per sample, each as long as the product of the remaining dimensions. It then transposes, so each column holds one sample.
The second example again reads the elements by incrementing the right-most index first, i.e.:
element = train_x_orig[0, 0, 0, 0]
new_row.append(element)
element = train_x_orig[0, 0, 0, 1]
new_row.append(element)
but the rows are sized differently. Each row is now only as long as the zeroth dimension (209 here), so it is filled with consecutive values in memory order instead of one value from each sample.
Here is an example to illustrate.
First we create an ordered array and reshape it to rank 4.
import numpy as np
x = np.arange(36).reshape(3,2,3,2)
x
# returns:
array([[[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]]],
[[[12, 13],
[14, 15],
[16, 17]],
[[18, 19],
[20, 21],
[22, 23]]],
[[[24, 25],
[26, 27],
[28, 29]],
[[30, 31],
[32, 33],
[34, 35]]]])
Here is the output of the first example
x.reshape(x.shape[0], -1).T
# returns:
array([[ 0, 12, 24],
[ 1, 13, 25],
[ 2, 14, 26],
[ 3, 15, 27],
[ 4, 16, 28],
[ 5, 17, 29],
[ 6, 18, 30],
[ 7, 19, 31],
[ 8, 20, 32],
[ 9, 21, 33],
[10, 22, 34],
[11, 23, 35]])
And here is the second example
x.reshape(x.shape[1]*x.shape[2]*x.shape[3], -1)
# returns:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26],
[27, 28, 29],
[30, 31, 32],
[33, 34, 35]])
How the elements get reordered is fundamentally different.
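As a quick check of that difference, here is a minimal sketch that compares the two reshapes on the same toy array. The names opt1 and opt2 are mine, and the shape (3, 2, 3, 2) only stands in for the real training set. It suggests that only the first variant keeps each sample together in one column.

```python
import numpy as np

# Toy stand-in for train_x_orig: 3 "samples" of shape (2, 3, 2).
x = np.arange(36).reshape(3, 2, 3, 2)

opt1 = x.reshape(x.shape[0], -1).T      # flatten each sample, then transpose
opt2 = x.reshape(-1, x.shape[0])        # reshape straight to (features, samples)

same_shape = opt1.shape == opt2.shape   # both are (12, 3)
same_values = np.array_equal(opt1, opt2)

# Column j of opt1 is exactly sample j, flattened.
columns_are_samples = all(
    np.array_equal(opt1[:, j], x[j].ravel()) for j in range(x.shape[0])
)

print(same_shape, same_values, columns_are_samples)
```

With the second variant the first column is 0, 3, 6, ..., which draws values from all three samples, so a cost computed per column no longer corresponds to one training example.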
I have a numpy 2D-array:
c = np.arange(36).reshape(6, 6)
[[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]]
I want to split it into multiple 2D arrays on a 3x3 grid (like splitting a big image into 9 small images):
[[ 0, 1,| 2, 3,| 4, 5],
[ 6, 7,| 8, 9,| 10, 11],
---------+--------+---------
[12, 13,| 14, 15,| 16, 17],
[18, 19,| 20, 21,| 22, 23],
---------+--------+---------
[24, 25,| 26, 27,| 28, 29],
[30, 31,| 32, 33,| 34, 35]]
In the end I need an array of nine 2D arrays, like this:
[[[0, 1], [6, 7]],
[[2, 3], [8, 9]],
[[4, 5], [10, 11]],
[[12, 13], [18, 19]],
[[14, 15], [20, 21]],
[[16, 17], [22, 23]],
[[24, 25], [30, 31]],
[[26, 27], [32, 33]],
[[28, 29], [34, 35]]]
This is just a sample of what I need. I want to know how to make small 2D arrays from a big 2D array by a grid (N, M).
You can use something like:
from numpy.lib.stride_tricks import sliding_window_view
out = np.vstack(sliding_window_view(c, (2, 2))[::2, ::2])
Output:
>>> out.tolist()
[[[0, 1], [6, 7]],
[[2, 3], [8, 9]],
[[4, 5], [10, 11]],
[[12, 13], [18, 19]],
[[14, 15], [20, 21]],
[[16, 17], [22, 23]],
[[24, 25], [30, 31]],
[[26, 27], [32, 33]],
[[28, 29], [34, 35]]]
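If the blocks never overlap and the array dimensions are exact multiples of the block size, a plain reshape is another option. The following is a sketch under that assumption; split_blocks is a hypothetical helper of mine, not a NumPy function.

```python
import numpy as np

def split_blocks(arr, block_h, block_w):
    """Split a 2D array into non-overlapping (block_h, block_w) tiles."""
    rows, cols = arr.shape
    if rows % block_h or cols % block_w:
        raise ValueError("array shape must be a multiple of the block shape")
    return (
        arr.reshape(rows // block_h, block_h, cols // block_w, block_w)
        .swapaxes(1, 2)                    # -> (grid_row, grid_col, block_h, block_w)
        .reshape(-1, block_h, block_w)     # -> one tile per entry, row-major over the grid
    )

c = np.arange(36).reshape(6, 6)
blocks = split_blocks(c, 2, 2)
print(blocks.shape)
print(blocks[0].tolist(), blocks[3].tolist())
```

The tiles come out in row-major grid order, which is the order asked for in the question.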
How can I convert the a array into the b array as they are specified below in Python and using numpy library? I am looking for a very efficient way since my actual array that I want to use this method on is very big. I should mention that the numbers can be any number and there is no relationship among the numbers. Also, I tried to show in the below picture how I want to slice the array.
import numpy as np
a = np.arange(1, 49).reshape(6, 8)
a = [[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16],
[17, 18, 19, 20, 21, 22, 23, 24],
[25, 26, 27, 28, 29, 30, 31, 32],
[33, 34, 35, 36, 37, 38, 39, 40],
[41, 42, 43, 44, 45, 46, 47, 48]]
b =[[1, 2, 9, 10], [2, 3, 10, 11], [3, 4, 11, 12], [4, 5, 12, 13],
[5, 6, 13, 14], [6, 7, 14, 15], [7, 8, 15, 16], [9, 10, 17, 18],
[10, 11, 18, 19], [11, 12, 19, 20], [12, 13, 20, 21], [13, 14, 21, 22],
[14, 15, 22, 23], [15, 16, 23, 24], [17, 18, 25, 26], [18, 19, 26, 27],
[19, 20, 27, 28], [20, 21, 28, 29], [21, 22, 29, 30], [22, 23, 30, 31],
[23, 24, 31, 32], [25, 26, 33, 34], [26, 27, 34, 35], [27, 28, 35, 36],
[28, 29, 36, 37], [29, 30, 37, 38], [30, 31, 38, 39], [31, 32, 39, 40],
[33, 34, 41, 42], [34, 35, 42, 43], [35, 36, 43, 44], [36, 37, 44, 45],
[37, 38, 45, 46], [38, 39, 46, 47], [39, 40, 47, 48]]
I was trying to find a way with reshape and transpose function but the problem is that I could not find a way to include the boundaries. c shows what I was thinking about the solution.
c = a.reshape(3, 2, 4, 2).transpose(0, 2, 3, 1).reshape(3*4, 2*2)
The picture: https://ibb.co/QC7tkPM.
With numpy 1.20 or higher you can use np.lib.stride_tricks.sliding_window_view:
import numpy as np
a = np.arange(12).reshape(3, 4)
print(a)
Gives:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
Below (2,2) is the shape of the sliding window:
windows = np.lib.stride_tricks.sliding_window_view(a, (2,2))
print(windows)
It gives:
[[[[ 0 1]
[ 4 5]]
[[ 1 2]
[ 5 6]]
[[ 2 3]
[ 6 7]]]
[[[ 4 5]
[ 8 9]]
[[ 5 6]
[ 9 10]]
[[ 6 7]
[10 11]]]]
In earlier versions of numpy a similar result can be obtained using np.lib.stride_tricks.as_strided:
w = 2 # width and height of the sliding window
r, c = a.shape
size = a.itemsize
ast = np.lib.stride_tricks.as_strided
windows_ast = ast(a,
                  shape=(r - w + 1, c - w + 1, w, w),
                  strides=(c * size, size, c * size, size))
print(windows_ast)
It gives:
[[[[ 0 1]
[ 4 5]]
[[ 1 2]
[ 5 6]]
[[ 2 3]
[ 6 7]]]
[[[ 4 5]
[ 8 9]]
[[ 5 6]
[ 9 10]]
[[ 6 7]
[10 11]]]]
Note that the numpy documentation warns that np.lib.stride_tricks.as_strided should be avoided if possible, since a wrong shape or strides argument makes it access memory outside the array. The above code may also fail if the input array does not have a contiguous memory layout.
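One way to relax the contiguity requirement is to reuse the array's own strides rather than computing them from itemsize. This is only a sketch: windows_2d is a hypothetical helper name, and it still assumes a 2D input with both dimensions at least w.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def windows_2d(a, w):
    """Return all (w, w) sliding windows of a 2D array as a read-only view."""
    r, c = a.shape
    return as_strided(
        a,
        shape=(r - w + 1, c - w + 1, w, w),
        strides=a.strides + a.strides,   # (row, col, row, col) steps in bytes
        writeable=False,                 # guard against accidental writes
    )

# A deliberately non-contiguous input: every second column of a 3x8 array.
a_nc = np.arange(24).reshape(3, 8)[:, ::2]
windows_nc = windows_2d(a_nc, 2)
print(windows_nc.shape)
print(windows_nc[0, 0].tolist(), windows_nc[1, 2].tolist())
```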
In any case, you can reshape the result to the desired shape:
windows.reshape(-1, 4)
It gives:
array([[ 0, 1, 4, 5],
[ 1, 2, 5, 6],
[ 2, 3, 6, 7],
[ 4, 5, 8, 9],
[ 5, 6, 9, 10],
[ 6, 7, 10, 11]])
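Applied to the 6x8 array from the question, the same two steps look like this. It is a minimal sketch that assumes numpy 1.20 or higher; the name b_windows is mine.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(1, 49).reshape(6, 8)

# (5, 7, 2, 2): one 2x2 window per valid top-left corner.
windows = sliding_window_view(a, (2, 2))

# Flatten each window into a row of 4 values; this step copies the data.
b_windows = windows.reshape(-1, 4)

print(b_windows.shape)
print(b_windows[0].tolist(), b_windows[-1].tolist())
```

The window array itself is a read-only view, so the memory cost only appears at the final reshape.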
Thank you all for your solutions and insights. They gave me a reason to update my numpy to 1.21.2. The reshape/transpose combo can also work if we are willing to repeat the rows and columns at the block boundaries using np.repeat. However, it would be more costly than np.lib.stride_tricks.sliding_window_view. Here is an example:
a = np.arange(1,49).reshape(6,8)
aa = np.repeat(a, repeats=[1,2,2,2,2,1], axis=0)
aa = np.repeat(aa, repeats=[1,2,2,2,2,2,2,1], axis=1)
b = aa.reshape(5,2,7,2).transpose(0,2,1,3).reshape(5*7,2*2)
I have a three dimensional array in this format:
x = [
[[1,2,3,4,5],[6,7,8,9,10]],
[[11,12,13,14,15],[16,17,18,19,20]],
[[21,22,23,24,25],[26,27,28,29,30]],
[[21,22,23,24,25]]
]
I'd like to split it into two, three dimensional arrays in this format:
y = [
[[1,2,3],[6,7,8]],
[[11,12,13],[16,17,18]],
[[21,22,23],[26,27,28]],
[[21,22,23]]
]
z = [
[[4,5],[9,10]],
[[14,15],[19,20]],
[[24,25],[29,30]],
[[24,25]]
]
I came up with this list comprehension for creating y:
[j[:3] for i in x for j in i]
Which returns this:
[[1, 2, 3], [6, 7, 8], [11, 12, 13], [16, 17, 18], [21, 22, 23], [26, 27, 28], [21, 22, 23]]
But as you'll see, it doesn't maintain the same multi-dimensional shape. Does anyone have any ideas?
You need to iterate one level deeper:
x = [[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], [[11, 12, 13, 14, 15], [16, 17, 18, 19, 20]], [[21, 22, 23, 24, 25], [26, 27, 28, 29, 30]], [[21, 22, 23, 24, 25]]]
y = [[i[:3] for i in b] for b in x]
z = [[i[-2:] for i in b] for b in x]
Output:
[[[1, 2, 3], [6, 7, 8]], [[11, 12, 13], [16, 17, 18]], [[21, 22, 23], [26, 27, 28]], [[21, 22, 23]]]
[[[4, 5], [9, 10]], [[14, 15], [19, 20]], [[24, 25], [29, 30]], [[24, 25]]]
Move your innermost loop into a nested comprehension so that the inner lists are preserved:
y = [[j[:3] for j in i] for i in x]
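Both halves can be produced by one small function. This is a sketch only; split_inner and its parameter k are hypothetical names, and it assumes every innermost list has at least k elements.

```python
def split_inner(nested, k):
    """Split every innermost list at position k, keeping the outer nesting."""
    head = [[row[:k] for row in group] for group in nested]
    tail = [[row[k:] for row in group] for group in nested]
    return head, tail

x = [
    [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
    [[11, 12, 13, 14, 15], [16, 17, 18, 19, 20]],
    [[21, 22, 23, 24, 25], [26, 27, 28, 29, 30]],
    [[21, 22, 23, 24, 25]],
]
y, z = split_inner(x, 3)
print(y)
print(z)
```

Because x is ragged (the last group has only one row), plain lists are a reasonable fit here; a regular NumPy array would need every group to have the same length.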
I have a 2d array like z and a 1d array denoting the "start column position" like starts. In addition I have a fixed row_length = 2
z = np.arange(35).reshape(5, -1)
# --> array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
starts = np.array([1,5,3,3,2])
What I want is the outcome of this slow for-loop, just quicker if possible.
result = np.zeros(
    (z.shape[0], row_length),
    dtype=z.dtype
)
for i in range(z.shape[0]):
    s = starts[i]
    result[i] = z[i, s:s+row_length]
So result in this example should look like this in the end:
array([[ 1, 2],
[12, 13],
[17, 18],
[24, 25],
[30, 31]])
I can't seem to find a way using either fancy indexing or np.take to deliver this result.
One approach is to build the column indices by broadcasted addition of starts and np.arange(row_length), then use NumPy's advanced indexing to extract all of those elements from the data array, like so -
idx = starts[:,None] + np.arange(row_length)
out = z[np.arange(idx.shape[0])[:,None], idx]
Sample run -
In [197]: z
Out[197]:
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
In [198]: starts = np.array([1,5,3,3,2])
In [199]: row_length = 2
In [200]: idx = starts[:,None] + np.arange(row_length)
In [202]: z[np.arange(idx.shape[0])[:,None], idx]
Out[202]:
array([[ 1, 2],
[12, 13],
[17, 18],
[24, 25],
[30, 31]])
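The same index array can also be passed to np.take_along_axis (available since NumPy 1.15), which avoids building the row index by hand. This is a minimal sketch; it assumes every start plus row_length stays within the number of columns.

```python
import numpy as np

z = np.arange(35).reshape(5, -1)
starts = np.array([1, 5, 3, 3, 2])
row_length = 2

# Column indices per row, shape (5, 2), built by broadcasting.
idx = starts[:, None] + np.arange(row_length)

# Pick idx[i, j] from row i of z for every (i, j).
result = np.take_along_axis(z, idx, axis=1)
print(result)
```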
I have the following slicing problem in numpy.
a = np.arange(36).reshape(-1,4)
a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31],
[32, 33, 34, 35]])
In my problem, every three rows represent one sample (in my case, coordinates).
I want to access this matrix so that a[0:2] returns the following:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
These are the first two coordinate samples.
I have to extract a large amount of these coordinate sets from an array.
Thanks
Based on How do you split a list into evenly sized chunks?, I found the following solution, which gives me the desired result.
def chunks(l, n, indices):
    return np.vstack([l[idx*n:idx*n+n] for idx in indices])
chunks(a,3,[0,2])
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[24, 25, 26, 27],
[28, 29, 30, 31],
[32, 33, 34, 35]])
This solution could probably be improved, perhaps in a way that avoids the stacking.
If three rows are a sample, you can reshape your array to reflect that, use fancy indexing to retrieve your samples, then undo the shape change:
>>> a = a.reshape(-1, 3, 4)
>>> a[[0, 2]].reshape(-1, 4)
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[24, 25, 26, 27],
[28, 29, 30, 31],
[32, 33, 34, 35]])
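With the reshaped array, the a[0:2] access from the question becomes an ordinary slice. The sketch below uses my own variable names (samples, first_two, picked) and also shows which of the two access patterns copies data.

```python
import numpy as np

a = np.arange(36).reshape(-1, 4)

# One entry per sample: shape (9 // 3, 3, 4). This is a view of a.
samples = a.reshape(-1, 3, 4)

first_two = samples[0:2].reshape(-1, 4)   # basic slice: still a view
picked = samples[[0, 2]].reshape(-1, 4)   # fancy indexing: a copy

print(first_two)
print(np.shares_memory(first_two, a), np.shares_memory(picked, a))
```

Contiguous ranges of samples therefore cost nothing extra, while arbitrary index lists such as [0, 2] allocate a new array.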