Given two matrices
A: m * r
B: n * r
I want to generate another matrix C: m * n, with each entry C_ij being a matrix calculated by the outer product of A_i and B_j.
For example,
A: [[1, 2],
[3, 4]]
B: [[3, 1],
[1, 2]]
gives
C: [[[[3, 1],
      [6, 2]],
     [[1, 2],
      [2, 4]]],
    [[[9, 3],
      [12, 4]],
     [[3, 6],
      [4, 8]]]]
I can do it using for loops, like
C = np.empty((A.shape[0], B.shape[0], A.shape[1], B.shape[1]), dtype=A.dtype)
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        C[i, j] = np.outer(A[i], B[j])
I wonder if there is a vectorised way of doing this calculation to speed it up?
Einstein notation expresses this problem nicely:
In [85]: np.einsum('ac,bd->abcd',A,B)
Out[85]:
array([[[[ 3, 1],
[ 6, 2]],
[[ 1, 2],
[ 2, 4]]],
[[[ 9, 3],
[12, 4]],
[[ 3, 6],
[ 4, 8]]]])
temp = numpy.multiply.outer(A, B)
C = numpy.swapaxes(temp, 1, 2)
NumPy ufuncs, such as multiply, have an outer method that almost does what you want. The following:
temp = numpy.multiply.outer(A, B)
produces a result such that temp[a, b, c, d] == A[a, b] * B[c, d]. You want C[a, b, c, d] == A[a, c] * B[b, d]. The swapaxes call rearranges temp to put it in the order you want.
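As a quick sanity check, the sketch below compares the multiply.outer plus swapaxes result with the explicit double loop from the question, using the question's example arrays. The name loop_C is only illustrative and does not appear in the original answer.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[3, 1], [1, 2]])

# temp[a, b, c, d] == A[a, b] * B[c, d]
temp = np.multiply.outer(A, B)
# Swap the two middle axes so that C[a, b, c, d] == A[a, c] * B[b, d]
C = np.swapaxes(temp, 1, 2)

# Reference result built with the explicit double loop (illustrative name)
loop_C = np.empty((A.shape[0], B.shape[0], A.shape[1], B.shape[1]), dtype=A.dtype)
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        loop_C[i, j] = np.outer(A[i], B[j])

print(np.array_equal(C, loop_C))  # True
print(C.shape)                    # (2, 2, 2, 2)
```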
Simple Solution with Numpy Array Broadcasting
Since you want C_ij = A_i * B_j, you can get it with NumPy broadcasting: flatten A into a column vector and B into a row vector, then take their element-wise product, as shown below:
# import numpy as np
# A = [[1, 2], [3, 4]]
# B = [[3, 1], [1, 2]]
A, B = np.array(A), np.array(B)
C = A.reshape(-1,1) * B.reshape(1,-1)
# same as:
# C = np.einsum('i,j->ij', A.flatten(), B.flatten())
print(C)
Output:
array([[ 3, 1, 1, 2],
[ 6, 2, 2, 4],
[ 9, 3, 3, 6],
[12, 4, 4, 8]])
You could then get your desired four sub-matrices by using numpy.dsplit() or numpy.array_split() as follows:
np.dsplit(C.reshape(2, 2, 4), 2)
# same as:
# np.array_split(C.reshape(2,2,4), 2, axis=2)
Output:
[array([[[ 3, 1],
[ 6, 2]],
[[ 9, 3],
[12, 4]]]),
array([[[1, 2],
[2, 4]],
[[3, 6],
[4, 8]]])]
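If the goal is the four-dimensional layout from the question (one outer-product block per pair of rows), broadcasting can also produce it directly, without flattening and splitting. This is a minimal sketch under that assumption; the name C4 is made up for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[3, 1], [1, 2]])

# A is viewed with shape (m, 1, r, 1) and B with shape (1, n, 1, r);
# broadcasting then yields C4[i, j, c, d] == A[i, c] * B[j, d].
C4 = A[:, None, :, None] * B[None, :, None, :]

print(C4.shape)  # (2, 2, 2, 2)
print(C4[1, 0])  # outer product of A[1] and B[0]
```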
Use numpy:
In [1]: import numpy as np
In [2]: A = np.array([[1, 2], [3, 4]])
In [3]: B = np.array([[3, 1], [1, 2]])
In [4]: C = np.outer(A, B)
In [5]: C
Out[5]:
array([[ 3, 1, 1, 2],
[ 6, 2, 2, 4],
[ 9, 3, 3, 6],
[12, 4, 4, 8]])
Once you have the desired result, you can use numpy.reshape() to mold it into almost any shape you want:
In [6]: C.reshape([4,2,2])
Out[6]:
array([[[ 3, 1],
[ 1, 2]],
[[ 6, 2],
[ 2, 4]],
[[ 9, 3],
[ 3, 6]],
[[12, 4],
[ 4, 8]]])
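Note that a plain reshape of the 4 x 4 result only regroups its rows; it does not by itself give one outer-product block per (i, j) pair. One possible way to get that layout from np.outer, sketched here with the question's arrays, is to reshape into four axes and then swap the middle two. The names flat and blocks are illustrative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[3, 1], [1, 2]])
m, r = A.shape
n = B.shape[0]

flat = np.outer(A, B)                # shape (m*r, n*r)
blocks = flat.reshape(m, r, n, r)    # blocks[i, c, j, d] == A[i, c] * B[j, d]
C = blocks.transpose(0, 2, 1, 3)     # C[i, j] == np.outer(A[i], B[j])

print(C[0, 1])
```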
To simplify my question, let's say I have these arrays:
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[2, 2, 2], [3, 3, 3]])
c = np.array([[1, 1, 3], [4, 1, 6]])
I would like to use element-wise multiplication on them so the result will be:
array([[ 2, 4, 18],
[ 48, 15, 108]])
I know I can do a*b*c, but that won't work if I have many 2d arrays or if I don't know the number of arrays. I am also aware of numpy.multiply but that works for only 2 arrays.
Use stack and prod.
stack will create an array which can be reduced by prod along an axis.
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[2, 2, 2], [3, 3, 3]])
c = np.array([[1, 1, 3], [4, 1, 6]])
unknown_length_list_of_arrays = [a, b, c]
d1 = a * b * c
stacked = np.stack(unknown_length_list_of_arrays)
d2 = np.prod(stacked, axis=0)
np.testing.assert_equal(d1, d2)
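Two other options that should give the same result are sketched below: the reduce method of the multiply ufunc, and functools.reduce, which multiplies the arrays pairwise without building a stacked copy. The variable names d3 and d4 are illustrative.

```python
import functools

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[2, 2, 2], [3, 3, 3]])
c = np.array([[1, 1, 3], [4, 1, 6]])
arrays = [a, b, c]  # any number of equally shaped arrays

# The list is converted to one array and reduced along axis 0
d3 = np.multiply.reduce(arrays)

# Pairwise multiplication, no intermediate stacked array
d4 = functools.reduce(np.multiply, arrays)

print(d3)
```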
Suppose I have two NumPy arrays
x = [[1, 2, 8],
[2, 9, 1],
[3, 8, 9],
[4, 3, 5],
[5, 2, 3],
[6, 4, 7],
[7, 2, 3],
[8, 2, 2],
[9, 5, 3],
[10, 2, 3],
[11, 2, 4]]
y = [0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0]
Note:
(The values in x need not be sorted in any way; I chose this ordered example only to make the illustration clearer.)
(These are just examples of x and y. Both can hold arbitrarily many, arbitrary values, but x always has as many rows as y has entries.)
I want to efficiently split the array x into sub-arrays according to the values in y.
My desired outputs would be
z_0 = [[1, 2, 8],
[2, 9, 1],
[4, 3, 5],
[10, 2, 3],
[11, 2, 4]]
z_1 = [[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]
z_2 = [[7, 2, 3],
[8, 2, 2],
[9, 5, 3]]
Assuming that y starts with zero and is not sorted but grouped, what is the most efficient way to do this?
Note: This question is the unsorted version of this question:
Split a NumPy array into subarrays according to the values (sorted in ascending order) of another array
One way to solve this is to build up a list of filter indexes for each y value and then simply select those elements of x. For example:
z_0 = x[[i for i, v in enumerate(y) if v == 0]]
z_1 = x[[i for i, v in enumerate(y) if v == 1]]
z_2 = x[[i for i, v in enumerate(y) if v == 2]]
Output
array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]])
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]])
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])
If you want to be more generic and support different sets of numbers in y, you could use a comprehension to produce a list of arrays e.g.
z = [x[[i for i, v in enumerate(y) if v == m]] for m in set(y)]
Output:
[array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]]),
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]),
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])]
If y is also an np.array and the same length as x you can simplify this to use boolean indexing:
z = [x[y==m] for m in set(y)]
Output is the same as above.
Just use a list comprehension and boolean indexing:
x = np.array(x)
y = np.array(y)
z = [x[y == i] for i in range(y.max() + 1)]
z
Out[]:
[array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]]),
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]),
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])]
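Boolean indexing scans y once per distinct label. When there are many labels, a single stable sort followed by np.split may be worth trying instead. This is only a sketch, assuming x and y are NumPy arrays of equal length; the names order, starts and groups are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 8], [2, 9, 1], [3, 8, 9], [4, 3, 5], [5, 2, 3], [6, 4, 7],
              [7, 2, 3], [8, 2, 2], [9, 5, 3], [10, 2, 3], [11, 2, 4]])
y = np.array([0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0])

# A stable sort keeps rows with the same label in their original order
order = np.argsort(y, kind="stable")

# starts[k] is the position in the sorted labels where the k-th label begins
labels, starts = np.unique(y[order], return_index=True)

# Cut the reordered rows at every label boundary
groups = np.split(x[order], starts[1:])

for label, group in zip(labels, groups):
    print(label, group.tolist())
```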
Slight variation.
from operator import itemgetter
label = itemgetter(1)
Associate the implied information with the label ... (index,label)
y1 = [thing for thing in enumerate(y)]
Sort on the label
y1.sort(key=label)
Group by label and construct the results
import itertools
d = {}
for key, group in itertools.groupby(y1, label):
    d[f'z{key}'] = [x[i] for i, k in group]
Pandas solution:
>>> import pandas as pd
>>> df = pd.DataFrame({'points':[thing for thing in x],'cat':y})
>>> z = df.groupby('cat').agg(list)
>>> z
points
cat
0 [[1, 2, 8], [2, 9, 1], [4, 3, 5], [10, 2, 3], ...
1 [[3, 8, 9], [5, 2, 3], [6, 4, 7]]
2 [[7, 2, 3], [8, 2, 2], [9, 5, 3]]
Suppose I have a matrix A with some arbitrary values:
array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
And a matrix B which contains indices of elements in A:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
How do I select values from A pointed by B, i.e.:
A[B] = [[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]]
EDIT: np.take_along_axis is a built-in function for this use case, available since NumPy 1.15. See @hpaulj's answer below for how to use it.
You can use NumPy's advanced indexing -
A[np.arange(A.shape[0])[:,None],B]
One can also use linear indexing -
m,n = A.shape
out = np.take(A,B + n*np.arange(m)[:,None])
Sample run -
In [40]: A
Out[40]:
array([[2, 4, 5, 3],
[1, 6, 8, 9],
[8, 7, 0, 2]])
In [41]: B
Out[41]:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
In [42]: A[np.arange(A.shape[0])[:,None],B]
Out[42]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
In [43]: m,n = A.shape
In [44]: np.take(A,B + n*np.arange(m)[:,None])
Out[44]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
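The linear-indexing version relies on the fact that, in the row-major flattened view of an m x n array, element (i, j) sits at position i*n + j. The sketch below spells out that intermediate step; row_offsets and flat_idx are illustrative names.

```python
import numpy as np

A = np.array([[2, 4, 5, 3], [1, 6, 8, 9], [8, 7, 0, 2]])
B = np.array([[0, 0, 1, 2], [0, 3, 2, 1], [3, 2, 1, 0]])
m, n = A.shape

# Offset of the first element of each row in the flattened array
row_offsets = n * np.arange(m)[:, None]   # [[0], [4], [8]]

# Column indices in B become positions in the flattened array
flat_idx = B + row_offsets
print(flat_idx)

out = A.ravel()[flat_idx]
print(out)
```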
More recent versions of NumPy (1.15 and later) have added a take_along_axis function that does the job:
A = np.array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
B = np.array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
np.take_along_axis(A, B, 1)
Out[]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
There's also a put_along_axis.
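As a small illustration of the pair, the sketch below uses take_along_axis to read each row's maximum and put_along_axis to overwrite it with zero. The use case and the names idx, row_max and A_zeroed are made up for the example. Note that put_along_axis modifies its first argument in place, so the sketch works on a copy.

```python
import numpy as np

A = np.array([[2, 4, 5, 3],
              [1, 6, 8, 9],
              [8, 7, 0, 2]])

# Column index of each row's maximum, kept 2-D so it lines up with A's rows
idx = np.argmax(A, axis=1)[:, None]

row_max = np.take_along_axis(A, idx, axis=1)

A_zeroed = A.copy()
np.put_along_axis(A_zeroed, idx, 0, axis=1)  # in place, returns None

print(row_max.ravel())
print(A_zeroed)
```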
I know this is an old question, but another way of doing it using indices is:
A[np.indices(B.shape)[0], B]
output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]
The following is a solution using for loops:
outlist = []
for i in range(len(B)):
    lst = []
    for j in range(len(B[i])):
        lst.append(A[i][B[i][j]])
    outlist.append(lst)
outarray = np.asarray(outlist)
print(outarray)
The above can also be written more succinctly as a list comprehension:
outlist = [ [A[i][B[i][j]] for j in range(len(B[i]))]
for i in range(len(B)) ]
outarray = np.asarray(outlist)
print(outarray)
Output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]
I'd like to turn an open mesh returned by the numpy ix_ routine into a list of coordinates,
e.g., for:
In[1]: m = np.ix_([0, 2, 4], [1, 3])
In[2]: m
Out[2]:
(array([[0],
[2],
[4]]), array([[1, 3]]))
What I would like is:
([0, 1], [0, 3], [2, 1], [2, 3], [4, 1], [4, 3])
I'm pretty sure I could hack it together with some iterating, unpacking and zipping, but I'm sure there must be a smart numpy way of achieving this...
Approach #1 Use np.meshgrid and then stack -
r,c = np.meshgrid(*m)
out = np.column_stack((r.ravel('F'), c.ravel('F') ))
Approach #2 Alternatively, with np.array() and then transposing, reshaping -
np.array(np.meshgrid(*m)).T.reshape(-1,len(m))
For the generic case with an arbitrary number of arrays passed to np.ix_, here are the modifications needed -
p = np.r_[2:0:-1,3:len(m)+1,0]
out = np.array(np.meshgrid(*m)).transpose(p).reshape(-1,len(m))
Sample runs -
Two arrays case :
In [376]: m = np.ix_([0, 2, 4], [1, 3])
In [377]: p = np.r_[2:0:-1,3:len(m)+1,0]
In [378]: np.array(np.meshgrid(*m)).transpose(p).reshape(-1,len(m))
Out[378]:
array([[0, 1],
[0, 3],
[2, 1],
[2, 3],
[4, 1],
[4, 3]])
Three arrays case :
In [379]: m = np.ix_([0, 2, 4], [1, 3],[6,5,9])
In [380]: p = np.r_[2:0:-1,3:len(m)+1,0]
In [381]: np.array(np.meshgrid(*m)).transpose(p).reshape(-1,len(m))
Out[381]:
array([[0, 1, 6],
[0, 1, 5],
[0, 1, 9],
[0, 3, 6],
[0, 3, 5],
[0, 3, 9],
[2, 1, 6],
[2, 1, 5],
[2, 1, 9],
[2, 3, 6],
[2, 3, 5],
[2, 3, 9],
[4, 1, 6],
[4, 1, 5],
[4, 1, 9],
[4, 3, 6],
[4, 3, 5],
[4, 3, 9]])
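Another possible route, sketched here as an alternative rather than as part of the answer above, is to broadcast the open mesh to full size and stack the results along a new last axis. This avoids the transpose bookkeeping for any number of arrays. The names full and coords are illustrative.

```python
import numpy as np

m = np.ix_([0, 2, 4], [1, 3])

# Expand each open-mesh array to the full grid shape (3, 2)
full = np.broadcast_arrays(*m)

# Stack along a new last axis, then flatten the grid into rows of coordinates
coords = np.stack(full, axis=-1).reshape(-1, len(m))

print(coords.tolist())
# [[0, 1], [0, 3], [2, 1], [2, 3], [4, 1], [4, 3]]
```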
I have a NumPy array, say:
a = array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
I have an array 'replication' of the same shape, where replication[i, j] (>= 0) denotes how many times a[i][j] should be repeated along its row. Obviously, the replication array satisfies the invariant that np.sum(replication[i]) has the same value for all i.
For example, if
replication = array([[1, 2, 1],
[1, 1, 2],
[2, 1, 1]])
then the final array after replicating is:
new_a = array([[1, 2, 2, 3],
[4, 5, 6, 6],
[7, 7, 8, 9]])
Presently, I am doing this to create new_a:
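# allocate new_a: one row per row of a, wide enough for the repeated values
h, w = a.shape
new_a = np.empty((h, replication[0].sum()), dtype=a.dtype)
for row in range(h):
    ll = [[a[row][j]] * replication[row][j] for j in range(w)]
    new_a[row] = np.array([item for sublist in ll for item in sublist])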
However, this seems to be too slow because it goes through Python lists. Can I do this entirely in NumPy, without using Python lists?
You can flatten out your replication array, then use the .repeat() method of a:
import numpy as np
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
replication = np.array([[1, 2, 1],
                        [1, 1, 2],
                        [2, 1, 1]])
new_a = a.repeat(replication.ravel()).reshape(a.shape[0], -1)
print(repr(new_a))
# array([[1, 2, 2, 3],
# [4, 5, 6, 6],
# [7, 7, 8, 9]])
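The final reshape only works if every row of replication sums to the same total, as the question assumes. A minimal sketch that checks this first might look like the following; the name row_totals and the error message are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
replication = np.array([[1, 2, 1],
                        [1, 1, 2],
                        [2, 1, 1]])

# Each row must expand to the same length, otherwise the result is ragged
row_totals = replication.sum(axis=1)
if not np.all(row_totals == row_totals[0]):
    raise ValueError("rows of replication must all sum to the same value")

# Repeat element-wise over the flattened arrays, then restore the rows
new_a = np.repeat(a.ravel(), replication.ravel()).reshape(a.shape[0], row_totals[0])
print(new_a)
```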