I've got an array which contains a bunch of points (3D vectors, specifically):
pts = np.array([
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 4],
    [5, 5, 5],
])
And I would like to multiply each one of those points by a transformation matrix:
pts[0] = np.dot(transform_matrix, pts[0])
pts[1] = np.dot(transform_matrix, pts[1])
…
pts[n] = np.dot(transform_matrix, pts[n])
How can I do this efficiently?
I find it helps to write the einsum version first -- after you see the indices, you can often recognize that there's a simpler version. For example, starting from
>>> pts = np.random.random((5,3))
>>> transform_matrix = np.random.random((3,3))
>>>
>>> pts_brute = pts.copy()
>>> for i in range(len(pts_brute)):
...     pts_brute[i] = transform_matrix.dot(pts_brute[i])
...
>>> pts_einsum = np.einsum("ij,kj->ik", pts, transform_matrix)
>>> np.allclose(pts_brute, pts_einsum)
True
you can see this is simply
>>> pts_dot = pts.dot(transform_matrix.T)
>>> np.allclose(pts_brute, pts_dot)
True
Matrix-matrix multiplication can be thought of as "batch-mode" matrix-vector multiplication, where each column in the second matrix is one of the vectors being multiplied by the first, with the result vectors being the columns of the resulting matrix.
Also note that since (AB)^T = B^T A^T, and therefore (by transposing both sides) ((AB)^T)^T = AB = (B^T A^T)^T, you can make a similar statement about the rows of the first matrix being batch-(left-)multiplied by the transpose of the second matrix, with the result vectors being the rows of the matrix product.
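A quick numerical sanity check of both statements (shapes here are just illustrative):
>>> A = np.random.random((3,3))
>>> B = np.random.random((3,4))
>>> np.allclose((A @ B).T, B.T @ A.T)   # (AB)^T == B^T A^T
True
>>> pts = np.random.random((5,3))
>>> np.allclose(pts @ A.T, np.array([A @ p for p in pts]))  # rows batch-multiplied
True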
I am trying to get the dot product of two arrays in Python using the numpy package. I get as output an array of size (n,). It says that my array has no column, while I do see the results when I print it. Why does my array have no column and how do I fix this?
My goal is to calculate y - np.dot(x,b). The issue is that y is (124, 1) while np.dot(x,b) is (124,)
Thanks
It seems that you are trying to subtract two arrays with different shapes. Fortunately, they only differ by a single additional axis, so there are two ways of handling it.
(1) You slice the y array to match the shape of the dot(x,b) array:
y = y[:,0]
print(y-np.dot(x,b))
(2) You add an additional axis on the np.dot(x,b) array:
dot = np.dot(x,b)
dot = dot[:,None]
print(y-dot)
Hope this helps
It may depend on the dimensions of your arrays.
For example:
a = [1, 0]
b = [[4, 1], [2, 2]]
c = np.dot(a,b)
gives
array([4, 1])
and its shape is (2,)
but if you change a to:
a = [[1, 0],[1,1]]
then the result is:
array([[4, 1],
       [6, 3]])
and its shape is (2,2)
Assume I have two matrices, A and B, and I want to compute C = AB using the sum of outer products.
I have written this function to achieve that, but I am wondering if I can eliminate the for loop and vectorize it:
import numpy as np

def mul_opx(A, B, pd):
    # Approx. matrix multiplication using outer products
    n, m = A.shape
    p = B.shape[1]
    C = np.zeros((n, p), dtype=A.dtype)
    dum = np.zeros_like(C)
    for t in range(m):
        dum = np.outer(A[:, t], B[t, :]) / pd[t]
        C = C + dum
    C = C / m
    return C
d = 1000
A = np.arange(d**2).reshape((d,d))
B = np.arange(d**2).reshape((d,d))
# Full Matrix Multiplication
C = A @ B
# Approximate Matrix Multiplication
# choosing half random vectors/rows from A/B
k = np.random.choice(d, int(d/2))
Ap = A[:,k]
Bp = B[k,:]
# Uniform probability vector
pd_uniform = np.full(d,1/d)
# Approximate product
C_hat = mul_opx(Ap,Bp, pd_uniform[k])
This type of product is useful when the matrix dimensions are very large, say 10^6 x 10^6.
As others have mentioned this could be a good use case for einsum.
Writing your operation in that language can be done with
np.einsum('ij,ik->jk', A, B)
Repeated i index for the sum, and unrepeated j k for the outer product. A quick benchmark seems to show a 2x speedup compared to @Tomer's proposed answer. This will depend on the input size of course, and I leave it to you to see how it generalizes to linear sizes in the 10^6 range; the memory footprint should also be better with the einsum.
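As a minimal sketch (illustrative sizes; not a benchmark), one way to check an einsum form directly against the question's loop is shown below. The subscripts here are written so the sum runs over the columns of A, matching the loop over t, and the 1/pd weights and the 1/m factor are folded in:
import numpy as np
rng = np.random.default_rng(0)
n, m, p = 6, 4, 5
A = rng.random((n, m))
B = rng.random((m, p))
pd = rng.random(m)
# Reference: the explicit weighted outer-product sum from the question.
C_loop = sum(np.outer(A[:, t], B[t, :]) / pd[t] for t in range(m)) / m
# Vectorized: divide each column of A by its weight, then contract with einsum.
C_vec = np.einsum('ij,jk->ik', A / pd, B) / m
print(np.allclose(C_loop, C_vec))  # True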
You can try using np.multiply.outer(A,B).
Assume you have the following data:
a = np.array([[4, 2],
              [2, 2]])
b = np.array([[4, 3],
              [4, 0]])
You want to do the following two multiplications:
np.outer(a[:,0],b[0,:]) = array([[16, 12],
                                 [ 8,  6]])
np.outer(a[:,1],b[1,:]) = array([[8, 0],
                                 [8, 0]])
This can be done using the np.multiply.outer method. np.multiply.outer "applies the ufunc op to all pairs (a, b) with a in A and b in B" (see the description here). This function performs all possible outer products between A and B, which in this simple example results in a (2,2,2,2) shaped array. Obviously, you do not need all possible outer products; you just need to extract the ones you want from this array.
You can see that:
np.multiply.outer(a,b)[0,:,0,:] = array([[16, 12],
                                         [ 8,  6]])
np.multiply.outer(a,b)[1,:,1,:] = array([[8, 0],
                                         [8, 0]])
Using this method you do not need the for loop, but you execute redundant computations. However, the numpy package is optimized, and perhaps this will be faster (for very large A and B you can try speeding things up using the jit decorator).
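A minimal sketch of that idea, assuming the jit decorator refers to Numba's @njit (an assumption, not stated above); explicit loops compile well under Numba, so the weighted outer-product sum is spelled out:
import numpy as np
from numba import njit  # assumption: "the jit decorator" means Numba's

@njit
def mul_opx_jit(A, B, pd):
    # Same weighted outer-product sum as mul_opx, written with explicit loops.
    n, m = A.shape
    p = B.shape[1]
    C = np.zeros((n, p))
    for t in range(m):
        w = 1.0 / pd[t]
        for i in range(n):
            for j in range(p):
                C[i, j] += A[i, t] * B[t, j] * w
    return C / m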
Another method, which avoids the redundant computations, is to use np.newaxis to expand the matrices before multiplication.
Using the same a, b from above, perform the following:
a[:,:,np.newaxis] = array([[[4],
                            [2]],
                           [[2],
                            [2]]])
b[:,np.newaxis,:] = array([[[4, 3]],
                           [[4, 0]]])
Now you can simply do multiplication to receive:
a[:,:,np.newaxis]*b[:,np.newaxis,:] = array([[[16, 12],
                                              [ 8,  6]],
                                             [[ 8,  0],
                                              [ 8,  0]]])
This is the exact result of the outer products without the redundant computation of np.multiply.outer. All that is left is to sum over the 1st dimension as follows:
results = np.sum(a[:,:,np.newaxis]*b[:,np.newaxis,:], axis=0)
Continuing with the second example, expanding it so that each outer product is divided by a different number can be done as follows:
Assuming the vector pd consists of two numbers (since there are two outer products), the change can now simply be done using:
pd = np.array([[[6]],[[8]]]) # shape is (2,1,1) because there are two outer products
solution = np.sum((a[:,:,np.newaxis] * b[:,np.newaxis,:]) / pd, axis=0)
Another option is to set pd as a (1,2) shaped array, and divide a prior to multiplication:
pd = np.array([[6,8]]) # shape (1,2) because there are two outer products
solution = np.sum((a / pd)[:,:,np.newaxis]* b[:,np.newaxis,:], axis=0)
Regarding the Einstein summation proposed in the other solution, you can go with:
pd = np.array([[6,8]]) # shape (1,2) because there are two outer products
solution = np.einsum('ij,ik->jk', a/pd, b)
I would like to apply the same matrix (3x3) to a large list of points contained in an array of shape (40000, 3). The code below does the job, but it is too slow. Are there any numpy tricks I can use to eliminate the for loop and the append function?
def apply_matrix_to_shape(Matrix, Points):
    """input a desired transformation and an array of points that are in
    the format np.array([[x1,y1,z1],[x2,y2,z2],...,]]). will output
    a new array of translated points with the same format"""
    New_shape = np.array([])
    M = Matrix
    for p in Points:
        New_shape = np.append(New_shape, [p[0]*M[0][0]+p[1]*M[0][1]+p[2]*M[0][2],
                                          p[0]*M[1][0]+p[1]*M[1][1]+p[2]*M[1][2],
                                          p[0]*M[2][0]+p[1]*M[2][1]+p[2]*M[2][2]])
    Rows = int(len(New_shape) / 3)
    return np.reshape(New_shape, (Rows, 3))
You basically want the matrix multiplication of both arrays (not an element-wise one). You just need to transpose so the shapes are aligned, and then transpose the result back:
m.dot(p.T).T
Or equivalently:
(m @ p.T).T
m = np.random.random((3,3))
p = np.random.random((15,3))
np.allclose((m @ p.T).T, apply_matrix_to_shape(m, p))
# True
Indeed, I think what you want is one of the main reasons why NumPy came to life. You can use the dot product function and the transpose function (simply .T or .transpose()).
import numpy as np
points = np.array([[1, 2, 3],
                   [4, 5, 6]])
T_matrix = np.array([[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9]])
result = points.dot(T_matrix.T)
print(result)
[[ 14  32  50]
 [ 32  77 122]]
It may be a stupid question, but I couldn't find a similar question asked (for now).
For example, I define a function called f(x, y):
def f(x, y):
return x+y
Now I want to output a 2D numpy array where the value of each element is equal to its indices summed. For example, if I want a 2x2 array:
arr = [[0, 1],
       [1, 2]]
If I want a 3x3 array, then the output should be:
arr = [[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]]
It's not efficient to assign the values one by one, especially if the array is large, say 10000x10000, and doing so also wastes numpy's speed. Although it sounds quite basic, I can't think of a simple and quick solution. What is the most common and efficient way to do it?
By the way, summing the indices is just an example. I hope that the method can also be generalized to arbitrary functions like, say,
def f(x,y):
return np.cos(x)+np.sin(y)
Or even to higher dimensional arrays, like 4x4 arrays.
You can use numpy.indices, which returns an array representing the indices of a grid; you'll just need to sum along the 0 axis:
>>> a = np.random.random((2,2))
>>> np.indices(a.shape).sum(axis=0) # array([[0, 1], [1, 2]])
>>> a = np.random.random((3,3))
>>> np.indices((3,3)).sum(axis=0) #array([[0, 1, 2], [1, 2, 3], [2, 3, 4]])
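For the arbitrary-function case from the question, one option (a sketch, reusing the question's own f) is to unpack the index grids and pass them to the vectorized function; np.fromfunction does the same thing in one call and extends to any number of dimensions via a longer shape tuple:
>>> def f(x, y):
...     return np.cos(x) + np.sin(y)
...
>>> i, j = np.indices((3, 3))
>>> np.allclose(f(i, j), np.fromfunction(f, (3, 3)))
True
>>> np.fromfunction(lambda x, y: x + y, (3, 3))  # the index-sum example, as floats
array([[0., 1., 2.],
       [1., 2., 3.],
       [2., 3., 4.]])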
This is my goal, using Python Numpy:
I would like to create a (1000,1000) dimensional array/matrix of dot product values. That means each array/matrix entry is the dot product of vectors 1 through 1000. Constructing this is theoretically simple: one defines a (1,1000) dimensional matrix of vectors v1, v2, ..., v1000
import numpy as np
vectorvalue = np.matrix([v1, v2, v3, ..., v1000])
and takes the dot product with the transpose, i.e.
matrix_of_dotproducts = np.tensordot(vectorvalue.T, vectorvalue)
And the shape of the array/matrix will be (1000, 1000). The (1,1) entry will be the dot product of vectors (v1,v1), the (1,2) entry will be the dot product of vectors (v1,v2), etc. In order to calculate the dot product with numpy for a three-dimensional vector, it's wise to use numpy.tensordot() instead of numpy.dot()
Here's my problem: I'm not beginning with an array of vector values. I'm beginning with three 1000 element arrays of each coordinate values, i.e. an array of x-coordinates, y-coordinates, and z-coordinates.
xvalues = np.array([x1, x2, x3, ..., x1000])
yvalues = np.array([y1, y2, y3, ..., y1000])
zvalues = np.array([z1, z2, z3, ..., z1000])
Is the easiest thing to do to construct a (3, 1000) numpy array/matrix and then take the tensor dot product for each pair?
v1 = np.array([x1,y1,z1])
v2 = np.array([x2,y2,z2])
...
I'm sure there's a more tractable and efficient way to do this...
PS: To be clear, I would like to take a 3D dot product. That is, for vectors
A = (a1, a2, a3)
and B = (b1, b2, b3),
the dot product should be
dotproduct(A,B) = a1*b1 + a2*b2 + a3*b3.
IIUC, you can build the intermediate array as you suggested:
>>> arr = np.vstack([xvalues, yvalues, zvalues]).T
>>> out = arr.dot(arr.T)
Which seems to be what you want:
>>> out.shape
(1000, 1000)
>>> out[3,4]
1.193097281209083
>>> arr[3].dot(arr[4])
1.193097281209083
So, you're not far off with your initial thought. There's very little overhead involved in concatenating the arrays, but if you're interested in doing it within numpy, there's a built-in set of functions, vstack, hstack, and dstack, that should perform exactly as you wish (vertical, horizontal, and depth respectively).
I'll leave it up to you to determine which to use where, but here's an example shamelessly stolen from the docs to help get you started:
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.vstack((a,b))
array([[1, 2, 3],
       [2, 3, 4]])
For reference: vstack docs, hstack docs, and dstack docs
If it feels a little over-the-top to have three separate functions here then you're right! That's why numpy also has the concatenate function. It's just a generalization of vstack, hstack, and dstack that takes an axis argument.
>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6]])
>>> np.concatenate((a, b), axis=0)
array([[1, 2],
       [3, 4],
       [5, 6]])
Concatenate docs