I have two NumPy arrays that I would like to multiply with each other across every row. To illustrate what I mean I have put the code below:
import numpy as np
a = np.array([
[1,2],
[3,4],
[5,6],
[7,8]])
b = np.array([
[1,2],
[4,4],
[5,5],
[7,10]])
final_product=[]
for i in range(0,b.shape[0]):
    product=a[i,:]*b
    final_product.append(product)
Rather than using loops and lists, is there a more direct, faster, and more elegant way of doing the above row-wise multiplication in NumPy?
With proper reshaping and repetition you can achieve what you are looking for. Here is a simple implementation:
a.reshape(4,1,2) * ([b]*4)
If the length is dynamic you can do this:
a.reshape(a.shape[0],1,a.shape[1]) * ([b]*a.shape[0])
Note: make sure a.shape[1] and b.shape[1] are equal; a.shape[0] and b.shape[0] may differ.
This type of problem can be handled with np.einsum (see the docs and this post for more background). It is one of the most efficient ways to do this:
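As a hedged aside, repeating b explicitly should not be necessary: NumPy broadcasting can stretch the length-1 axis of the reshaped a on its own. The sketch below assumes the a and b from the question and checks the broadcast result against the original loop; the names broadcast_product and loop_product are illustrative only.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
b = np.array([[1, 2], [4, 4], [5, 5], [7, 10]])

# a[:, None, :] has shape (4, 1, 2); broadcasting it against b, shape (4, 2),
# gives shape (4, 4, 2): one (4, 2) block per row of a.
broadcast_product = a[:, None, :] * b

# Reference result built the same way as the loop in the question.
loop_product = np.array([a[i, :] * b for i in range(a.shape[0])])

print(np.array_equal(broadcast_product, loop_product))
```

Because no repeated copy of b is materialized, this form should also use less temporary memory than building [b]*n first.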
np.einsum("ij, kj->ikj", a, b)
Try:
n = b.shape[0]
print(np.multiply(np.repeat(a, n, axis=0).reshape((a.shape[0], n, -1)), b))
Prints:
[[[ 1 4]
[ 4 8]
[ 5 10]
[ 7 20]]
[[ 3 8]
[12 16]
[15 20]
[21 40]]
[[ 5 12]
[20 24]
[25 30]
[35 60]]
[[ 7 16]
[28 32]
[35 40]
[49 80]]]
I have two numpy arrays A and B, both with the dimension [2,2,n], where n is a very large number. I want to matrix multiply A and B in the first two dimensions to get C, i.e. C=AB, where C has the dimension [2,2,n].
The simplest way to accomplish this is with a for loop, i.e.
for i in range(n):
    C[:,:,i] = np.matmul(A[:,:,i],B[:,:,i])
However, this is inefficient since n is very large. What's the most efficient way to do this with numpy?
You can do the following:
new_array = np.einsum('ijk,jlk->ilk', A, B)
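To make the index string concrete, here is a small sketch that compares the einsum call with the loop from the question on random data. The size n = 5 and the seed are arbitrary assumptions for illustration. It also shows a possible alternative that moves the stacking axis to the front so that the batched @ operator can be used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # small illustrative size; the question has a very large n
A = rng.standard_normal((2, 2, n))
B = rng.standard_normal((2, 2, n))

# Reference: explicit loop over the last axis, as in the question.
C_loop = np.empty((2, 2, n))
for i in range(n):
    C_loop[:, :, i] = np.matmul(A[:, :, i], B[:, :, i])

# einsum: sum over the shared index j, keep the stacking index k in place.
C_einsum = np.einsum('ijk,jlk->ilk', A, B)

# Alternative: move the stacking axis to the front, use batched matmul,
# then move it back to the end.
C_matmul = np.moveaxis(np.moveaxis(A, -1, 0) @ np.moveaxis(B, -1, 0), 0, -1)

print(np.allclose(C_loop, C_einsum), np.allclose(C_loop, C_matmul))
```

Which variant is faster likely depends on n and the NumPy build, so it is worth timing both on real data.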
What you want is the default array multiplication in NumPy
In [22]: a = np.arange(8).reshape((2,2,2))+1 ; a[:,:,0], a[:,:,1]
Out[22]:
(array([[1, 3],
[5, 7]]),
array([[2, 4],
[6, 8]]))
In [23]: aa = a*a ; aa[:,:,0], aa[:,:,1]
Out[23]:
(array([[ 1, 9],
[25, 49]]),
array([[ 4, 16],
[36, 64]]))
Notice that I emphasized array because Numpy's arrays look like matrices but are indeed Numpy's ndarrays.
Post Scriptum
I guess that what you really want are arrays with shape (n,2,2), so that you can address individual 2×2 matrices using a single index, e.g.,
In [27]: n = 3
...: a = np.arange(n*2*2)+1 ; a_22n, a_n22 = a.reshape((2,2,n)), a.reshape((n,2,2))
...: print(a_22n[0])
...: print(a_n22[0])
[[1 2 3]
[4 5 6]]
[[1 2]
[3 4]]
Post Post Scriptum
Re semantically correct:
In [13]: import numpy as np
...: n = 3
...: a = np.arange(2*2*n).reshape((2,2,n))+1
...: p = lambda t,a,n:print(t,*(a[:,:,i]for i in range(n)),sep=',\n')
...: p('Original array', a, n)
...: p('Using `einsum("ijk,jlk->ilk", ...)`', np.einsum('ijk,jlk->ilk', a, a), n)
...: p('Using standard multiplication', a*a, n)
Original array,
[[ 1 4]
[ 7 10]],
[[ 2 5]
[ 8 11]],
[[ 3 6]
[ 9 12]]
Using `einsum("ijk,jlk->ilk", ...)`,
[[ 29 44]
[ 77 128]],
[[ 44 65]
[104 161]],
[[ 63 90]
[135 198]]
Using standard multiplication,
[[ 1 16]
[ 49 100]],
[[ 4 25]
[ 64 121]],
[[ 9 36]
[ 81 144]]
I'm new to python and I'm trying to find the best way to transform my array.
I have two arrays, A and B. I want to add them together such that every value of array A is added to two values of array B.
A = np.array([2, 4, 6, 8, 10])
B = np.array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10])
Combining the two would give me array C as
C = np.array([12, 12, 14, 14, 16, 16, 18, 18, 20, 20])
I thought maybe a for loop might achieve this, but I'm not sure how to specify that each value of array A should be applied twice before continuing. Any help would be appreciated, thank you so much!
It's sort of hacky and not quite for a beginner:
(A[:, None] + B.reshape((-1, 2))).reshape(-1)
A[:, None] treats A as a 5x1 array. B.reshape((-1, 2)) treats B as a 5x2 array, so each row holds the two values of B that pair with one value of A. NumPy knows how to add 5x1 arrays to 5x2 arrays via broadcasting. The final reshape turns the 5x2 array back into a 10-element array.
-1 in a reshape says "make this dimension as large as necessary to use all the data." That way, I don't have to bake 5x2 into the code, and this will work for any n-element and 2n-element arrays.
You could reshape, add and reshape back:
import numpy as np
A = np.array([2, 4, 6, 8, 10])
B = np.array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10])
res = (B.reshape((-1, 2)) + A[:, None]).reshape(-1)
print(res)
Output
[12 12 14 14 16 16 18 18 20 20]
The expression:
B.reshape((-1, 2))
creates the following array:
[[10 10]
[10 10]
[10 10]
[10 10]
[10 10]]
Basically, you tell NumPy to fit the array into N rows of 2 columns, where N is determined by the original size of B (that is what the -1 does). The other part:
A[:, None]
creates:
[[ 2]
[ 4]
[ 6]
[ 8]
[10]]
Then using broadcasting you can add them together. Finally reshape back.
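As a possible alternative to reshaping, one could repeat each element of A twice with np.repeat and add the result to B directly. This is a sketch under the assumption that B is exactly twice as long as A, as in the question.

```python
import numpy as np

A = np.array([2, 4, 6, 8, 10])
B = np.array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10])

# np.repeat(A, 2) duplicates every element in place:
# [2, 2, 4, 4, 6, 6, 8, 8, 10, 10]
C = np.repeat(A, 2) + B
print(C)
```

This reads closest to the wording of the question ("apply each value of A twice"), at the cost of one temporary array the size of B.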
You could use slicing to index every other item of B, then augmented assignment for an in-place transformation. I don't know the details of how NumPy will handle the slicing, but I think this will use the least memory.
B[::2] += A
B[1::2] += A
Another way to expand and reshape is
B += np.array([A, A]).flatten("F")
It looks to me like this will use 4x the size of A, but I think all of the reshaping methods will eat memory.
I have two matrices:
a = np.array([[6],[3],[4]])
b = np.array([1,10])
when I do:
c = a * b
c looks like this:
[ 6, 60]
[ 3, 30]
[ 4, 40]
which is good.
Now, let's say I add a column to a (for the sake of the example it's an identical column, but it doesn't have to be):
a = np.array([[6,6],[3,3],[4,4]])
b stays the same.
The result I want is 2 identical copies of c (since the columns are identical), stacked along a new axis:
new_c.shape == (3, 2, 2)
so that if you do new_c[:,:,0] or new_c[:,:,1] you get the original c.
I tried adding new axes to both a and b using np.expand_dims but it did not help.
One way is using numpy.einsum:
>>> import numpy as np
>>> a = np.array([[6],[3],[4]])
>>> b = np.array([1,10])
>>> print(a * b)
[[ 6 60]
[ 3 30]
[ 4 40]]
>>> print(np.einsum('ij, j -> ij', a, b))
[[ 6 60]
[ 3 30]
[ 4 40]]
>>> a = np.array([[6,6],[3,3],[4,4]])
>>> print(np.einsum('ij, k -> ikj', a, b)[:, :, 0])
>>> print(np.einsum('ij, k -> ikj', a, b)[:, :, 1])
[[ 6 60]
[ 3 30]
[ 4 40]]
[[ 6 60]
[ 3 30]
[ 4 40]]
For more usage about numpy.einsum, I recommend:
Understanding NumPy's einsum
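For comparison, the same (3, 2, 2) result can probably be obtained with plain broadcasting, which is what the np.expand_dims attempt in the question was aiming for. The sketch below places the new axes by hand and checks the result against the einsum call above; new_c follows the question's naming, the other names are illustrative.

```python
import numpy as np

a = np.array([[6, 6], [3, 3], [4, 4]])
b = np.array([1, 10])

# a[:, None, :] has shape (3, 1, 2) and b[:, None] has shape (2, 1).
# Broadcasting gives shape (3, 2, 2) with new_c[i, k, j] = a[i, j] * b[k].
new_c = a[:, None, :] * b[:, None]

via_einsum = np.einsum('ij, k -> ikj', a, b)
print(new_c.shape, np.array_equal(new_c, via_einsum))
print(new_c[:, :, 0])
```

The key point is that b needs its new axis at the end, so its values vary along the middle axis of the result while the columns of a vary along the last one.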
You have multiple options here, one of which is using numpy.einsum as explained in the other answer. Another possibility is using array reshape method:
result = a.T.reshape((a.shape[1], a.shape[0], 1)) * b
result = result.reshape((-1, 2))
result
array([[ 6, 60],
[ 3, 30],
[ 4, 40],
[ 6, 60],
[ 3, 30],
[ 4, 40]])
Yet what is more intuitive to me is to stack the arrays by means of np.vstack, with each column of a multiplied by b as follows:
result = np.vstack([c[:, None] * b for c in a.T])
result
array([[ 6, 60],
[ 3, 30],
[ 4, 40],
[ 6, 60],
[ 3, 30],
[ 4, 40]])
I have a 3D numpy array:
K = (np.arange(36)).reshape((4,3,3))+1
[[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]]
[[10 11 12]
[13 14 15]
[16 17 18]]
[[19 20 21]
[22 23 24]
[25 26 27]]
[[28 29 30]
[31 32 33]
[34 35 36]]]
where each item in K is a matrix.
Now, I want to get all 2D submatrices using a certain index vector.
I know that it is possible in this way:
idx = np.s_[:,:2,:2]
K_sub = K[idx]
[[[ 1 2]
[ 4 5]]
[[10 11]
[13 14]]
[[19 20]
[22 23]]
[[28 29]
[31 32]]]
The problem is that I want to use an arbitrary indexing array and not slicing to select rows and cols.
Moreover, I want to use a single object to get the list of submatrices, something like:
K_sub = [magic_indexing]
and not:
K_sub = np.array([k_[train][:,train] for k_ in K])
Is there a simple way to do it?
Not sure if it's simple enough for you, but one way would be to use np.ix_, and thus advanced indexing, like so:
K[np.ix_(np.arange(K.shape[0]), train, train)]
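A small worked sketch may help here. The index vector train is not defined in the question, so the value [0, 2] below is purely a hypothetical choice. The example checks the np.ix_ result against the list comprehension from the question and also shows an equivalent form that broadcasts the index arrays by hand.

```python
import numpy as np

K = np.arange(36).reshape((4, 3, 3)) + 1
train = np.array([0, 2])  # hypothetical index vector: rows/cols 0 and 2

# np.ix_ builds an open mesh, so every combination of the given
# indices along each axis is selected.
K_sub = K[np.ix_(np.arange(K.shape[0]), train, train)]

# Reference: the per-matrix version from the question.
K_loop = np.array([k_[train][:, train] for k_ in K])

# Equivalent form: keep the first axis with a slice and broadcast
# a column of row indices against a row of column indices.
K_alt = K[:, train[:, None], train]

print(K_sub[0])
```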
I have the following array:
import numpy as np
a = np.array([[ 1, 2, 3],
[ 1, 2, 3],
[ 1, 2, 3]])
I understand that np.random.shuffle(a.T) will shuffle the array along the row, but what I need is for it to shuffle each row independently. How can this be done in NumPy? Speed is critical as there will be several million rows.
For this specific problem, each row will contain the same starting population.
import numpy as np
np.random.seed(2018)
def scramble(a, axis=-1):
    """
    Return an array with the values of `a` shuffled along the given axis.
    Note: the same random permutation is applied to every slice.
    """
    b = a.swapaxes(axis, -1)
    n = a.shape[axis]
    idx = np.random.choice(n, n, replace=False)
    b = b[..., idx]
    return b.swapaxes(axis, -1)
a = np.arange(4*9).reshape(4, 9)
# array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
# [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
# [18, 19, 20, 21, 22, 23, 24, 25, 26],
# [27, 28, 29, 30, 31, 32, 33, 34, 35]])
print(scramble(a, axis=1))
yields
[[ 3 8 7 0 4 5 1 2 6]
[12 17 16 9 13 14 10 11 15]
[21 26 25 18 22 23 19 20 24]
[30 35 34 27 31 32 28 29 33]]
while scrambling along the 0-axis:
print(scramble(a, axis=0))
yields
[[18 19 20 21 22 23 24 25 26]
[ 0 1 2 3 4 5 6 7 8]
[27 28 29 30 31 32 33 34 35]
[ 9 10 11 12 13 14 15 16 17]]
This works by first swapping the target axis with the last axis:
b = a.swapaxes(axis, -1)
This is a common trick used to standardize code which deals with one axis.
It reduces the general case to the specific case of dealing with the last axis.
Since in NumPy version 1.10 or higher swapaxes returns a view, there is no copying involved and so calling swapaxes is very quick.
Now we can generate a new index order for the last axis:
n = a.shape[axis]
idx = np.random.choice(n, n, replace=False)
Now we can reorder b along the last axis (note that the same permutation idx is applied to every row, as the output above shows):
b = b[..., idx]
and then reverse the swapaxes to return an a-shaped result:
return b.swapaxes(axis, -1)
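Note that scramble draws a single permutation idx and applies it to every slice, which is why all rows in the output above share the same column order. If each row needs its own permutation, one common vectorized trick is to sort independent random keys and gather with np.take_along_axis. The sketch below is an assumption-laden illustration, not part of the answer above: the function name scramble_independent and its rng parameter are hypothetical.

```python
import numpy as np

def scramble_independent(a, axis=-1, rng=None):
    """Shuffle `a` along `axis`, with a separate permutation per slice."""
    rng = np.random.default_rng() if rng is None else rng
    # Sorting i.i.d. uniform keys along `axis` yields an independent
    # random permutation of indices for every slice.
    idx = np.argsort(rng.random(a.shape), axis=axis)
    return np.take_along_axis(a, idx, axis=axis)

a = np.arange(4 * 9).reshape(4, 9)
shuffled = scramble_independent(a, axis=1, rng=np.random.default_rng(2018))
print(shuffled)
```

The cost is one argsort over the whole array, so this is O(n log n) per row instead of O(n), but it avoids a Python-level loop over millions of rows.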
If you don't want a return value and want to operate on the array directly, you can specify the indices to shuffle.
>>> import numpy as np
>>>
>>>
>>> a = np.array([[1,2,3], [1,2,3], [1,2,3]])
>>>
>>> # Shuffle row `2` independently
>>> np.random.shuffle(a[2])
>>> a
array([[1, 2, 3],
[1, 2, 3],
[3, 2, 1]])
>>>
>>> # Shuffle column `0` independently
>>> np.random.shuffle(a[:,0])
>>> a
array([[3, 2, 3],
[1, 2, 3],
[1, 2, 1]])
If you want a return value as well, you can use numpy.random.permutation, in which case replace np.random.shuffle(a[n]) with a[n] = np.random.permutation(a[n]).
Warning: do not do a[n] = np.random.shuffle(a[n]). shuffle does not return anything (it returns None), so the row/column you end up "shuffling" will be filled with nan for a float array, and the assignment will raise an error for an integer array.
Good answer above. But I will throw in a quick and dirty way:
a = np.array([[1,2,3], [4,5,6], [7,8,9]])
ignore_list_output = [np.random.shuffle(x) for x in a]
Then, a can be something like this:
array([[2, 1, 3],
[4, 6, 5],
[9, 7, 8]])
Not very elegant, but you can get the job done with just one short line.
Building on my comment to #Hun's answer, here's the fastest way to do this:
def shuffle_along(X):
    """Minimal in place independent-row shuffler."""
    [np.random.shuffle(x) for x in X]
This works in-place and can only shuffle rows. If you need more options:
def shuffle_along(X, axis=0, inline=False):
    """More elaborate version of the above."""
    if not inline:
        X = X.copy()
    if axis == 0:
        [np.random.shuffle(x) for x in X]
    if axis == 1:
        [np.random.shuffle(x) for x in X.T]
    if not inline:
        return X
This, however, has the limitation of only working on 2d-arrays. For higher dimensional tensors, I would use:
def shuffle_along(X, axis=0, inline=True):
    """Shuffle along any axis of a tensor."""
    if not inline:
        X = X.copy()
    np.apply_along_axis(np.random.shuffle, axis, X) # <-- I just changed this
    if not inline:
        return X
You can do it with NumPy without any loop or extra function, and much faster. E.g., we have an array of shape (6, 2) and we want a sub-array of shape (2, 2) with an independent random index for each column.
import numpy as np
test = np.array([[1, 1],
[2, 2],
[0.5, 0.5],
[0.3, 0.3],
[4, 4],
[7, 7]])
id_rnd = np.random.randint(6, size=(2, 2)) # select random row indices (with replacement); use choice and range if you don't want replacement.
new = np.take_along_axis(test, id_rnd, axis=0)
Out:
array([[2. , 2. ],
[0.5, 2. ]])
It works for any number of dimensions.
As of NumPy 1.20.0 released in January 2021 we have a permuted() method on the new Generator type (introduced with the new random API in NumPy 1.17.0, released in July 2019). This does exactly what you need:
import numpy as np
rng = np.random.default_rng()
a = np.array([
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
])
shuffled = rng.permuted(a, axis=1)
This gives you something like
>>> print(shuffled)
[[2 3 1]
[1 3 2]
[2 1 3]]
As you can see, the rows are permuted independently. This is in sharp contrast with both rng.permutation() and rng.shuffle().
If you want an in-place update you can pass the original array as the out keyword argument. And you can use the axis keyword argument to choose the direction along which to shuffle your array.
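The in-place variant mentioned above could look like the following sketch, which passes the array itself as out. The seed value is an arbitrary assumption, used only to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for repeatability
a = np.array([
    [1, 2, 3],
    [1, 2, 3],
    [1, 2, 3],
])
original = a.copy()

# Passing the input as `out` shuffles each row of `a` in place.
rng.permuted(a, axis=1, out=a)
print(a)

# Each row is still a permutation of the original row.
print(np.array_equal(np.sort(a, axis=1), original))
```

Using axis=0 instead would shuffle each column independently.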