import numpy as np
A = np.array(
[ [ [45, 12, 4], [45, 13, 5], [46, 12, 6] ],
[ [46, 14, 4], [45, 14, 5], [46, 11, 5] ],
[ [47, 13, 2], [48, 15, 5], [52, 15, 1] ] ])
print(A[1:3, 0:2])
Please explain this. I have been struggling to understand it.
When accessing a 3D array this way, what you are actually asking for is a cut through each nesting level of the array:
A[1:3, 0:2, 0:3]
# ↑↑↑
# Of the outer array (the outer []), take elements 1 (inclusive) to 3 (exclusive).
# Mind that counting starts at 0, so this is the second and third line in your example
A[1:3, 0:2, 0:3]
#      ↑↑↑
# Out of the second level array, take the elements 0 (inclusive) to 2 (exclusive).
# This is the first and the second group of three numbers each
A[1:3, 0:2, 0:3]
#           ↑↑↑
# This you did not specify, but it is added automatically
# Of the third level arrays, take element 0 (inclusive) to 3 (exclusive)
# Those arrays only have 3 numbers each, so they are left untouched.
In [483]: A = np.array(
...: [ [ [45, 12, 4], [45, 13, 5], [46, 12, 6] ],
...: [ [46, 14, 4], [45, 14, 5], [46, 11, 5] ],
...: [ [47, 13, 2], [48, 15, 5], [52, 15, 1] ] ])
The whole 3d array. If you need to put names on the dimensions, I'd suggest 'plane', 'row' and 'column':
In [484]: A
Out[484]:
array([[[45, 12, 4],
[45, 13, 5],
[46, 12, 6]],
[[46, 14, 4],
[45, 14, 5],
[46, 11, 5]],
[[47, 13, 2],
[48, 15, 5],
[52, 15, 1]]])
In [485]: A.shape
Out[485]: (3, 3, 3)
Taking a slice on the first dimension (the last 2 planes):
In [486]: A[1:3]
Out[486]:
array([[[46, 14, 4],
[45, 14, 5],
[46, 11, 5]],
[[47, 13, 2],
[48, 15, 5],
[52, 15, 1]]])
Taking 2 rows from each of those planes:
In [487]: A[1:3, 0:2]
Out[487]:
array([[[46, 14, 4],
[45, 14, 5]],
[[47, 13, 2],
[48, 15, 5]]])
The last dimension, columns, is left whole, the equivalent of A[1:3, 0:2, :] (trailing slices are automatic).
3D slicing is just the same as 1d and 2d (and 4d etc). There's nothing special or really different about 3d.
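As a quick check of the points above, here is a minimal sketch. The names `sub` and `sub_explicit` are mine, not from the question; the array is the one being asked about.

```python
import numpy as np

A = np.array(
    [[[45, 12, 4], [45, 13, 5], [46, 12, 6]],
     [[46, 14, 4], [45, 14, 5], [46, 11, 5]],
     [[47, 13, 2], [48, 15, 5], [52, 15, 1]]])

# Two slices given: planes 1..2 and rows 0..1; the column axis is implied.
sub = A[1:3, 0:2]
# The same selection with the trailing slice written out.
sub_explicit = A[1:3, 0:2, :]

print(sub.shape)                          # (2, 2, 3)
print(np.array_equal(sub, sub_explicit))  # True
# Basic slicing returns a view of A, so no data is copied.
print(np.shares_memory(sub, A))           # True
```

Because the result is a view, writing into `sub` would also change `A`; call `.copy()` on it if that is not wanted.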
I have two np arrays
a = np.array([[1,2],
              [2,3],
              [3,4],
              [5,6]])
b = np.array([[2,4],
              [6,8],
              [10,11]])
I want to multiply each row of a against each row in array b, so that array c is created with dimensions of a-rows x b-rows (as columns)
c = np.array([[2,8],[6,16],[10,22],
              [4,12],[12,24],[20,33],
              ....])
There are other options for doing this, but I would really like to leverage the speed of numpy's ufuncs...if possible.
Any and all help is appreciated.
Does this do what you want?
>>> a
array([[1, 2],
[2, 3],
[3, 4],
[5, 6]])
>>> b
array([[ 2, 4],
[ 6, 8],
[10, 11]])
>>> a[:,None,:]*b
array([[[ 2, 8],
[ 6, 16],
[10, 22]],
[[ 4, 12],
[12, 24],
[20, 33]],
[[ 6, 16],
[18, 32],
[30, 44]],
[[10, 24],
[30, 48],
[50, 66]]])
>>> _.shape
(4, 3, 2)
Or if that doesn't have the right shape, you can reshape it:
>>> (a[:,None,:]*b).reshape((a.shape[0]*b.shape[0], 2))
array([[ 2, 8],
[ 6, 16],
[10, 22],
[ 4, 12],
[12, 24],
[20, 33],
[ 6, 16],
[18, 32],
[30, 44],
[10, 24],
[30, 48],
[50, 66]])
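To see why `a[:,None,:]*b` gives every row of `a` times every row of `b`, here is a small sketch that compares it with an explicit loop. The name `expected` is mine and the loop is only there as a slow reference.

```python
import numpy as np

a = np.array([[1, 2], [2, 3], [3, 4], [5, 6]])
b = np.array([[2, 4], [6, 8], [10, 11]])

# a[:, None, :] has shape (4, 1, 2); b has shape (3, 2).
# Broadcasting stretches the length-1 axis to 3, giving (4, 3, 2).
c = a[:, None, :] * b

# Reference result built with explicit loops (slow, for checking only).
expected = np.array([[row_a * row_b for row_b in b] for row_a in a])

print(c.shape)                      # (4, 3, 2)
print(np.array_equal(c, expected))  # True
```

`c[i, j]` is row `i` of `a` multiplied element by element with row `j` of `b`.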
I have two matrices (numpy arrays), mu and nu. From these I would like to create a third array as follows:
new_array_{j, k, l} = mu_{l, k} nu_{j, k}
I can do it naively using list comprehensions:
[[[mu[l, k] * nu[j, k] for l in np.arange(N)] for k in np.arange(N)] for j in np.arange(N)]
but it quickly becomes slow.
How can I create new_array using numpy functions which should be faster?
Two quick solutions (without my usual proofs and explanations):
res = np.einsum('lk,jk->jkl', mu, nu)
res = mu.T[None,:,:] * nu[:,:,None] # axes in same order as result
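Since these two one-liners come without explanation, here is a sketch that checks both against a plain loop over the definition new_array[j, k, l] = mu[l, k] * nu[j, k]. The example data and the names `ref`, `res_einsum` and `res_bcast` are assumptions chosen for illustration; the shapes are deliberately not square so that a swapped axis would show up.

```python
import numpy as np

mu = np.arange(10).reshape(2, 5)        # shape (nl, nk)
nu = np.arange(15).reshape(3, 5) + 20   # shape (nj, nk)
nl, nk = mu.shape
nj = nu.shape[0]

res_einsum = np.einsum('lk,jk->jkl', mu, nu)
# mu.T[None] has shape (1, nk, nl); nu[:, :, None] has shape (nj, nk, 1).
res_bcast = mu.T[None, :, :] * nu[:, :, None]

# Loop reference for new_array[j, k, l] = mu[l, k] * nu[j, k].
ref = np.empty((nj, nk, nl), dtype=mu.dtype)
for j in range(nj):
    for k in range(nk):
        for l in range(nl):
            ref[j, k, l] = mu[l, k] * nu[j, k]

print(res_einsum.shape)                # (3, 5, 2)
print(np.array_equal(res_einsum, ref)) # True
print(np.array_equal(res_bcast, ref))  # True
```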
#!/usr/bin/env python
import numpy as np
# example data
mu = np.arange(10).reshape(2,5)
nu = np.arange(15).reshape(3,5) + 20
# get array sizes
nl, nk = mu.shape
nj, nk_ = nu.shape
assert(nk == nk_)
# get arrays with dimensions (nj, nk, nl)
# in the case of mu3d, we need to add a slowest varying dimension
# so (after transposing) this can be done by cycling through the data
# nj times along the slowest existing axis and then reshaping
mu3d = np.concatenate((mu.transpose(),) * nj).reshape(nj, nk, nl)
# in the case of nu3d, we need to add a new fastest varying dimension
# so this can be done by repeating each element nl times, and again it
# needs reshaping
nu3d = nu.repeat(nl).reshape(nj, nk, nl)
# now just multiply element by element
new_array = mu3d * nu3d
print(new_array)
Gives:
>>> mu
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
>>> nu
array([[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34]])
>>> nj, nk, nl
(3, 5, 2)
>>> mu3d
array([[[0, 5],
[1, 6],
[2, 7],
[3, 8],
[4, 9]],
[[0, 5],
[1, 6],
[2, 7],
[3, 8],
[4, 9]],
[[0, 5],
[1, 6],
[2, 7],
[3, 8],
[4, 9]]])
>>> nu3d
array([[[20, 20],
[21, 21],
[22, 22],
[23, 23],
[24, 24]],
[[25, 25],
[26, 26],
[27, 27],
[28, 28],
[29, 29]],
[[30, 30],
[31, 31],
[32, 32],
[33, 33],
[34, 34]]])
>>> new_array
array([[[ 0, 100],
[ 21, 126],
[ 44, 154],
[ 69, 184],
[ 96, 216]],
[[ 0, 125],
[ 26, 156],
[ 54, 189],
[ 84, 224],
[116, 261]],
[[ 0, 150],
[ 31, 186],
[ 64, 224],
[ 99, 264],
[136, 306]]])
I am building a neural network, for which I have to flatten my training dataset.
I have two options.
The first is:
train_x_flatten = train_x_orig.reshape(train_x_orig.shape[0], -1).T
and the second is:
train_x_flatten = train_x_orig.reshape(train_x_orig.shape[1]*train_x_orig.shape[2]*train_x_orig.shape[3], 209)
Both give the same shape, but I get different results when computing the cost.
Why is that? Thank you.
Your original tensor has at least rank 4, judging by the second example. The first example reads the elements with the right-most index varying fastest and lays them out in shape[0] rows, one row per entry along the zeroth axis, each holding all the remaining elements. It then transposes, so each of those rows becomes a column.
The second example again reads the elements by incrementing the right-most index first, i.e.:
element = train_x_orig[0, 0, 0, 0]
new_row.append(element)
element = train_x_orig[0, 0, 0, 1]
new_row.append(element)
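but the row length is different. Each row now holds only shape[0] consecutive elements (209 in your case), so the data belonging to one entry along the zeroth axis is spread over many rows instead of staying together in one column.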
Here is an example to illustrate.
First we create an ordered array and reshape it to rank 4.
import numpy as np
x = np.arange(36).reshape(3,2,3,2)
x
# returns:
array([[[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]]],
[[[12, 13],
[14, 15],
[16, 17]],
[[18, 19],
[20, 21],
[22, 23]]],
[[[24, 25],
[26, 27],
[28, 29]],
[[30, 31],
[32, 33],
[34, 35]]]])
Here is the output of the first example
x.reshape(x.shape[0], -1).T
# returns:
array([[ 0, 12, 24],
[ 1, 13, 25],
[ 2, 14, 26],
[ 3, 15, 27],
[ 4, 16, 28],
[ 5, 17, 29],
[ 6, 18, 30],
[ 7, 19, 31],
[ 8, 20, 32],
[ 9, 21, 33],
[10, 22, 34],
[11, 23, 35]])
And here is the second example
x.reshape(x.shape[1]*x.shape[2]*x.shape[3], -1)
# returns:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26],
[27, 28, 29],
[30, 31, 32],
[33, 34, 35]])
How the elements get reordered is fundamentally different.
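The difference can also be checked directly, as in the sketch below. It reuses the small `x` from above as a stand-in for the real training data; the names `first` and `second` are mine.

```python
import numpy as np

x = np.arange(36).reshape(3, 2, 3, 2)   # 3 "examples" of 12 values each

first = x.reshape(x.shape[0], -1).T                           # shape (12, 3)
second = x.reshape(x.shape[1] * x.shape[2] * x.shape[3], -1)  # shape (12, 3)

print(first.shape == second.shape)                 # True
# Column 0 of `first` is exactly example 0, flattened.
print(np.array_equal(first[:, 0], x[0].ravel()))   # True
# Column 0 of `second` takes every third value, mixing all examples.
print(second[:, 0])
print(np.array_equal(second[:, 0], x[0].ravel()))  # False
```

So only the first option keeps each example in its own column, which would explain a different cost even though the shapes agree.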
I have a 3D array a of data and a 2D array b of indices. I need to take a sub-array of a along the 3rd axis, using the indices from b. I can do it with take like this:
a = np.arange(24).reshape((2,3,4))
b = np.array([0,2,1,3]).reshape((2,2))
np.array([np.take(a_,b_,axis=1) for (a_,b_) in zip(a,b)])
Can I do it without list comprehension, using some fancy indexing? I am worried about efficiency, so if fancy indexing is not more efficient in this case, I would like to know it.
EDIT The 1st thing I've tried is a[[0,1],:,b] but it doesn't give the sub-array I need
In [317]: a
Out[317]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [318]: a = np.arange(24).reshape((2,3,4))
...: b = np.array([0,2,1,3]).reshape((2,2))
...: np.array([np.take(a_,b_,axis=1) for (a_,b_) in zip(a,b)])
...:
Out[318]:
array([[[ 0, 2],
[ 4, 6],
[ 8, 10]],
[[13, 15],
[17, 19],
[21, 23]]])
So you want the 0 & 2 columns from the 1st block, and 1 & 3 from the second.
Make a c that matches b in shape, and embodies this observation
In [319]: c=np.array([[0,0],[1,1]])
In [320]: c
Out[320]:
array([[0, 0],
[1, 1]])
In [321]: b
Out[321]:
array([[0, 2],
[1, 3]])
In [322]: a[c,:,b]
Out[322]:
array([[[ 0, 4, 8],
[ 2, 6, 10]],
[[13, 17, 21],
[15, 19, 23]]])
That's the right numbers, but not the right shape.
A column vector can be used instead of c.
In [323]: a[np.arange(2)[:,None],:,b] # or a[[[0],[1]],:,b]
Out[323]:
array([[[ 0, 4, 8],
[ 2, 6, 10]],
[[13, 17, 21],
[15, 19, 23]]])
As for the shape, we can transpose the last two axes
In [324]: a[np.arange(2)[:,None],:,b].transpose(0,2,1)
Out[324]:
array([[[ 0, 2],
[ 4, 6],
[ 8, 10]],
[[13, 15],
[17, 19],
[21, 23]]])
This transpose is required because we have a slice between two index arrays, a mix of basic and advanced indexing. It's documented, but nevertheless often puzzling. NumPy puts the slice dimension (3) last, and we have to transpose it back.
Nice little indexing puzzle!
The latest question and explanation of this advanced/basic transpose:
Indexing numpy multidimensional arrays depends on a slicing method
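If the transpose feels awkward, two alternatives avoid mixing a slice with index arrays. This is only a sketch: the helper names `i`, `j`, `out1`, `out2` and `ref` are mine, and `np.take_along_axis` needs NumPy 1.15 or later.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))
b = np.array([0, 2, 1, 3]).reshape((2, 2))

# Reference: the list-comprehension version from the question.
ref = np.array([np.take(a_, b_, axis=1) for a_, b_ in zip(a, b)])

# Option 1: index every axis with an array, so no slice is mixed in
# and the result comes out directly in (block, row, picked column) order.
i = np.arange(a.shape[0])[:, None, None]   # shape (2, 1, 1)
j = np.arange(a.shape[1])[None, :, None]   # shape (1, 3, 1)
out1 = a[i, j, b[:, None, :]]              # index shapes broadcast to (2, 3, 2)

# Option 2: take_along_axis broadcasts the index array against `a`
# on every axis except the one being indexed.
out2 = np.take_along_axis(a, b[:, None, :], axis=2)

print(out1.shape)                 # (2, 3, 2)
print(np.array_equal(out1, ref))  # True
print(np.array_equal(out2, ref))  # True
```

Whether either is faster than the list comprehension depends on the array sizes, so it is worth timing on the real data.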
This is my first try. I will see if I can do better.
# select the columns per block, join the pieces with np.r_, then reshape.
np.r_[a[0][:,b[0]],a[1][:,b[1]]].reshape(2,3,2)
Out[300]:
array([[[ 0, 2],
[ 4, 6],
[ 8, 10]],
[[13, 15],
[17, 19],
[21, 23]]])
Second try:
#convert both a and b to a 2d array and then slice all rows and only columns determined by b.
a.reshape(6,4)[np.arange(6)[:,None],b.repeat(3,0)].reshape(2,3,2)
Out[429]:
array([[[ 0, 2],
[ 4, 6],
[ 8, 10]],
[[13, 15],
[17, 19],
[21, 23]]])
I have many 3x2 matrices (A1, A2, A3, ...), each of which is one random draw. With two draws, stacking A1 and A2 horizontally gives a 3x4 matrix. It is easier for me to draw the larger 3x4 matrix (A) once than to draw a 3x2 matrix over and over again.
But I need to multiply each draw (each A1, A2, ...) by a matrix B, i.e. A1*B, A2*B, ..., AN*B.
#each draw of the 3*2 matrix
A1 = np.array([[ 0, 1],
[ 4, 5],
[ 8, 9]])
A2 = np.array([[ 2, 3],
[ 6, 7],
[ 10, 11]])
# A is [A1,A2]
# Easier to draw A once for all (the larger matrix)
A = np.array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
b = np.array([[ 0, 1],
[ 4, 5]
])
desired output
array([[ 4, 5, 12, 17],
[20, 29, 28, 41],
[36, 53, 44, 65]])
You can reshape matrix A to 2 columns so that it is conformable to b, do the matrix multiplication, and then reshape it back:
np.dot(A.reshape(-1, 2), b).reshape(3, -1)
#array([[ 4, 5, 12, 17],
# [20, 29, 28, 41],
# [36, 53, 44, 65]])
If you are unsure about how to store/stack the incoming arrays, one way would be to stack them as a 3D array, so that each incoming array is indexable along the first axis -
a = np.array((A1,A2))
Sample run -
In [143]: a = np.array((A1,A2))
In [144]: a.shape
Out[144]: (2, 3, 2)
           |-----------------> axis of stacking
Then, to get the equivalent output of matrix-multiplications of each incoming array with b, we could use np.tensordot on the 3D stacked array a with b, thus losing the last axis from a and first from b in the sum-reduction, like so -
out = np.tensordot(a,b,axes=((2),(0)))
Let's have a look at the output values and compare against each matrix-multiplication with A1, A2, etc. -
In [138]: out[0]
Out[138]:
array([[ 4, 5],
[20, 29],
[36, 53]])
In [139]: out[1]
Out[139]:
array([[12, 17],
[28, 41],
[44, 65]])
In [140]: A1.dot(b)
Out[140]:
array([[ 4, 5],
[20, 29],
[36, 53]])
In [141]: A2.dot(b)
Out[141]:
array([[12, 17],
[28, 41],
[44, 65]])
Thus, essentially with this stacking operation and later on tensordot we have :
out[0], out[1], .... = A1.dot(b), A2.dot(b), ....
Alternative to np.tensordot -
We could use a simpler version with np.matmul, to get the same output as with tensordot -
out = np.matmul(a,b)
On Python 3.5 and later, there's an even simpler version that replaces np.matmul, the @ operator -
out = a @ b
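The three variants can be compared side by side, as sketched below with the A1, A2 and b from the question. The last step, turning the stacked result back into the side-by-side 3x4 layout, is my addition and the name `wide` is hypothetical.

```python
import numpy as np

A1 = np.array([[0, 1], [4, 5], [8, 9]])
A2 = np.array([[2, 3], [6, 7], [10, 11]])
b = np.array([[0, 1], [4, 5]])

a = np.array((A1, A2))                 # shape (2, 3, 2): (draw, row, col)

out_tensordot = np.tensordot(a, b, axes=((2), (0)))
out_matmul = np.matmul(a, b)           # b is broadcast over the draw axis
out_operator = a @ b                   # operator form of np.matmul

print(np.array_equal(out_tensordot, out_matmul))  # True
print(np.array_equal(out_operator, out_matmul))   # True

# Back to the side-by-side (3, 4) layout: put rows first, then merge
# the draw and column axes.
wide = out_matmul.transpose(1, 0, 2).reshape(3, -1)
print(wide)
```

`wide` matches the desired output in the question, so the 3D stacking loses nothing compared with the horizontally stacked form.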
Even if not needed for the calculation einsum can help us think through the problem:
In [584]: np.einsum('ij,jk->ik', A1,b)
Out[584]:
array([[ 4, 5],
[20, 29],
[36, 53]])
In [585]: np.einsum('ij,jk->ik', A2,b)
Out[585]:
array([[12, 17],
[28, 41],
[44, 65]])
A is (3,4), which won't work with the (2,2) b. Think of it as trying to work with a doubled j dimension: 'i(2j),jk->i?k'. But what if we inserted an axis? 'imj,jk->imk'? Or added the extra dimension to i?
In [587]: np.einsum('imj,jk->imk', A.reshape(3,2,2),b)
Out[587]:
array([[[ 4, 5],
[12, 17]],
[[20, 29],
[28, 41]],
[[36, 53],
[44, 65]]])
The numbers are there, just the shape is (3,2,2).
In [590]: np.einsum('imj,jk->imk', A.reshape(3,2,2),b).reshape(3,4)
Out[590]:
array([[ 4, 5, 12, 17],
[20, 29, 28, 41],
[36, 53, 44, 65]])
Or you could build A from the start so that mij,jk->mik works (@Divakar)
@Psidom:
np.einsum('ij,jk->ik', A.reshape(3,2,2).reshape(-1,2) ,b).reshape(3,-1)
@piRSquared:
'kj,jI->kI'
Shift your perspective. You are locking yourself into 3 x 2 unnecessarily.
You can think of A1 and A2 as 2x3 instead, then A would be
array([[ 0, 4, 8, 2, 6, 10],
[ 1, 5, 9, 3, 7, 11]])
Then take the transpose of b, b = b.T
array([[0, 4],
[1, 5]])
So that you can do your operation
b @ A
array([[ 4, 20, 36, 12, 28, 44],
[ 5, 29, 53, 17, 41, 65]])
Let your "draws" look like this
A = np.random.randint(10, size=(2, 9))
A
array([[7, 2, 1, 0, 9, 9, 1, 0, 2],
[8, 6, 1, 6, 6, 2, 4, 2, 9]])
b @ A
array([[32, 24, 4, 24, 24, 8, 16, 8, 36],
[47, 32, 6, 30, 39, 19, 21, 10, 47]])
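To connect this transposed layout back to the original draws, here is a sketch using the A1, A2 and b from the question. The names `A_wide`, `result` and `per_draw` are mine; note that `b` is left untransposed here and `b.T` is written out explicitly.

```python
import numpy as np

A1 = np.array([[0, 1], [4, 5], [8, 9]])
A2 = np.array([[2, 3], [6, 7], [10, 11]])
b = np.array([[0, 1], [4, 5]])

# Store each draw as 2x3 and stack the draws side by side: shape (2, 6).
A_wide = np.hstack((A1.T, A2.T))

# One matrix product handles every draw at once.
result = b.T @ A_wide

# Each 3-column block of `result` is the transpose of the per-draw product.
per_draw = np.hstack(((A1 @ b).T, (A2 @ b).T))
print(np.array_equal(result, per_draw))  # True
```

This works because (An @ b).T equals b.T @ An.T, so appending more draws only adds columns and never requires a reshape.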