Numpy: sigma notion over two 2D arrays - python

I have the following 2D Numpy arrays that can take an arbitrary shape (d, c), with d and c being equal to each other :
import numpy as np
arr_a = np.array([[1, 2, 3], [4, 5, 6]])
arr_b = np.array([[10, 20, 30], [40, 50, 60]])
I want to solve for t (a float) by taking the summation of these 2D arrays as follows:
I believe np.add(arr_a, arr_b) returns a matrix, so that wouldn't apply here.
Are there any native Numpy functions that can do this? Or, would I have to iterate through arr_a and arr_b?
And, if iteration is required here, how would I apply the summation logic above to the iteration?
(part of my confusion is understanding the i-th and j-th elements)
Thanks in advance for your help!

If A and B are square matrices, equation in the picture just translates to 2*np.sum(a)
import numpy as np
A = np.random.rand(100,100) ##
B = np.random.rand(100,100) ##
your_required_answer = 2*np.sum(A) ## np.sum(A-B) + np.sum(A.T+B.T)

Related

How can I vectorize approximate matrix multiplication using sum of outer products?

Assume I have two matrices, A and B, and I want to compute C = AB using the sum of outer products.
I have written this function to achieve that, but I am wondering If can eliminate the for loop and vectorize it,
import numpy as np
def mul_opx(A, B, pd):
# Approx. matrix multiplication using outer product
n, m = A.shape
p = B.shape[1]
C = np.zeros((n,p), dtype=A.dtype)
dum = np.zeros_like(C)
for t in range(m):
dum = np.outer(A[:,t],B[t,:]) / pd[t]
C = C + dum
C = C / m
return C
d = 1000
A = np.arange(d**2).reshape((d,d))
B = np.arange(d**2).reshape((d,d))
# Full Matrix Multiplication
C = A # B
# Approximate Matrix Multiplication
# choosing half random vectors/rows from A/B
k = np.random.choice(d, int(d/2))
Ap = A[:,k]
Bp = B[k,:]
# Unifrom probability vector
pd_uniform = np.full(d,1/d)
# Approximate product
C_hat = mul_opx(Ap,Bp, pd_uniform[k])
This type of product is useful when matrix dimensions are very large say 10^6 x 10^6
As others have mentioned this could be a good use case for einsum.
Writing your operation in that language can be done with
np.einsum( 'ij,ik->jk',A,B)
Repeated i index for the sum, and unrepeated j k for the outer product. A quick benchmark seems to show a 2x speedup compared to #Tomer's proposed answer. This will depend on the input size of course and I leave to you to see how it generalizes to linear sizes in the 10^6 range, the memory footprint should also be better with the einsum.
You can try using np.multiply.outer(A,B).
Assume you have the following data:
a = np.array([[4, 2],
[2, 2]])
b = np.array([[4, 3],
[4, 0]])
You want to do the following two multiplications:
np.outer(a[:,0],b[0,:]) = array([[16, 12],
[ 8, 6]])
np.outer(a[:,1],b[1,:]) = array([[8, 0],
[8, 0]])
This can be done using the mp.multiply.outer method. The np.multiply.outer "Apply the ufunc op to all pairs (a, b) with a in A and b in B." (See description here). This function perform all possible outer products between A and B, which in this simple example results in a (2,2,2,2) shape matrix. Obviously, you do not need all possible outer product, you just need to extract the one you want from this matrix.
You can see that:
np.multiply.outer(a,b)[0,:,0,:] = array([[16, 12],
[ 8, 6]])
np.multiply.outer(a,b)[1,:,1,:] = array([[8, 0],
[8, 0]])
Using this method you do not need to do the for loop, but you execute redundant computations. However, the numpy package is optimized, and perhaps this will be faster (for very large A and B you can try speeding things up using the jit decorator
Another method in which you do not compute redundant computations is via using np.newaxis to expand the matrices before multiplication.
Using the same a, b from above, perform the following:
a[:,:,np.newaxis] = array([[[4],
[2]],
[[2],
[2]]])
b[:,np.newaxis,:] = array([[[4, 3]],
[[4, 0]]])
Now you can simply do multiplication to receive:
a[:,:,np.newaxis]*b[:,np.newaxis,:] = array([[[16, 12],
[ 8, 6]],
[[ 8, 0],
[ 8, 0]]])
This is the exact result of the outer products without the redundant computation of np.multiply.outer. All that is left is to sum over the 1st dimension as follows:
results = np.sum(a[:,:,np.newaxis]*b[:,np.newaxis,:], axis=0)
Continuing with the second example, exapnding the example to divide each outer product by a different number can be done as follows:
Assuming the vector pd consists of to numbers (since there fore two outer products), the change needed can now simply be done using::
pd = np.array([[[6]],[[8]]]) # shape is (2,1,1) because there are two outer products
solution = np.sum((a[:,:,np.newaxis] * b[:,np.newaxis,:]) / pd, axis=0)
Another method will be to set pd as a (1,2) shaped array, and divide a prior to multiplication:
pd = np.array([[6,8]]) # shape (1,2) beacuse there are two outer products
solution = np.sum((a / pd)[:,:,np.newaxis]* b[:,np.newaxis,:], axis=0)
Regaurding the Einstein summation proposed in the other solution, you can go with:
pd = np.array([[6,8]]) # shape (1,2) beacuse there are two outer products
solution = np.einsum( 'ij,ik->jk',a/pd,b)

Add lists of numpy arrays element-wise

I've been working on an algorithm for backpropagation in neural networks. My program calculates the partial derivative of each weight with respect to the loss function, and stores it in an array. The weights at each layer are stored in a single 2d numpy array, and so the partial derivatives are stored as an array of numpy arrays, where each numpy array has a different size depending on the number of neurons in each layer.
When I want to average the array of partial derivatives after a number of training data has been used, I want to add each array together and divide by the number of arrays. Currently, I just iterate through each array and add each element together, but is there a quicker way? I could use ndarray with dtype=object but apparently, this has been deprecated.
For example, if I have the arrays:
arr1 = [ndarray([[1,1],[1,1],[1,1]]), ndarray([[2,2],[2,2]])]
arr2 = [ndarray([[3,3],[3,3],[3,3]]), ndarray([[4,4],[4,4]])]
How can I add these together to get the array:
arr3 = [ndarray([[4,4],[4,4],[4,4]]), ndarray([[6,6],[6,6]])]
You don't need to add the numbers in the array element-wise, make use of numpy's parallel computations by using numpy.add
Here's some code to do just that:
import numpy as np
arr1 = np.asarray([[[1,1],[1,1],[1,1]], [[2,2],[2,2]]])
arr2 = np.asarray([[[3,3],[3,3],[3,3]], [[4,4],[6,6]]])
ans = []
for first, second in zip(arr1, arr2):
ans.append(np.add(first,second))
Outputs:
>>> [array([[4, 4], [4, 4], [4, 4]]), array([[6, 6], [8, 8]])]
P.S
Could use a one-liner list-comprehension as well
ans = [np.add(first, second) for first, second in zip(arr1, arr2)]
You can use zip/map/sum:
import numpy as np
arr1 = [np.array([[1,1],[1,1],[1,1]]), np.array([[2,2],[2,2]])]
arr2 = [np.array([[3,3],[3,3],[3,3]]), np.array([[4,4],[4,4]])]
arr3 = list(map(sum, zip(arr1, arr2)))
output:
>>> arr3
[array([[4, 4],
[4, 4],
[4, 4]]),
array([[6, 6],
[6, 6]])]
In NumPy, you can add two arrays element-wise by adding two NumPy arrays.
N.B: if your array shape varies then reshape the array and fill with 0.
arr1 = np.array([np.array([[1,1],[1,1],[1,1]]), np.array([[2,2],[2,2]])])
arr2 = np.array([np.array([[3,3],[3,3],[3,3]]), np.array([[4,4],[4,4]])])
arr3 = arr2 + arr1
You can use a list comprehension:
[x + y for x, y in zip(arr1, arr2)]

How can I multiply numpy matrix elementwise without for loops?

I would like to apply the same matrix (3x3) to a large list of points that are contained in a vector. The vector is of the form (40000 x 3). The below code does the job but it is too slow. Are there any numpy tricks I can use to eliminate the for loop and append function?
def apply_matrix_to_shape(Matrix,Points):
"""input a desired transformation and an array of points that are in
the format np.array([[x1,y1,z1],[x2,y2,z2],...,]]). will output
a new array of translated points with the same format"""
New_shape = np.array([])
M = Matrix
for p in Points:
New_shape = np.append(New_shape,[p[0]*M[0][0]+p[1]*M[0][1]+p[2]*M[0][2],
p[0]*M[1][0]+p[1]*M[1][1]+p[2]*M[1][2],
p[0]*M[2][0]+p[1]*M[2][1]+p[2]*M[2][2]])
Rows = int(len(New_shape) / 3)
return np.reshape(New_shape,(Rows,3))
You basically want the matrix multiplication of both arrays (not an element-wise one). You just need to tranpose so the shapes are aligned, and transpose back the result:
m.dot(p.T).T
Or equivalently:
(m#p.T).T
m = np.random.random((3,3))
p = np.random.random((15,3))
np.allclose((m#p.T).T, apply_matrix_to_shape(m, p))
# True
Indeed, I think what you want is one of the main reason why NumPy came to live. You can use the dot product function and the transpose function (simply .T or .transpose())
import numpy as np
points = np.array([[1, 2, 3],
[4, 5, 6]])
T_matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
result = points.dot(T_matrix.T)
print(result)
>>> [[ 14 32 50]
[ 32 77 122]]

Applying matrix functions like scipy.linalg.eigh to higher dimensional arrays

I am new to numpy but have been using python for quite a while as an engineer.
I am writing a program that currently stores stress tensors as 3x3 numpy arrays within another NxM array which represents values through time and through the thickness of a wall, so overall it is an NxMx3x3 numpy array. I want to efficiently calculate the eigenvals and vectors of each 3x3 array within this larger array. So far I have tried to using "fromiter" but this doesn't seem to work because the functions returns 2 arrays. I have also tried apply_along_axis which also doesn't work because it says the inner 3x3 is not a square matrix? I can do it with list comprehension, but this doesn't seem ideal to resort to using lists.
Example just calculating eigenvals using list comprehension
import numpy as np
from scipy import linalg
a=np.random.random((2,2,3,3))
f=linalg.eigvalsh
ans=np.asarray([f(x) for x in a.reshape((4,3,3))])
ans.shape=(2,2,3)
I thought something like this would work but I have played around with it and can't get it working:
np.apply_along_axis(f,0,a)
BTW the 2x2 bit could be up to 5000x100 and this code is repeated ~50x50x200 times hence the need for efficiency. Any help would be greatly appreciated?
You can use numpy.linalg.eigh. It accepts an array like your example a.
Here's an example. First, create an array of 3x3 symmetric arrays:
In [96]: a = np.random.random((2, 2, 3, 3))
In [97]: a = a + np.transpose(a, axes=(0, 1, 3, 2))
In [98]: a[0, 0]
Out[98]:
array([[0.61145048, 0.85209618, 0.03909677],
[0.85209618, 1.79309413, 1.61209077],
[0.03909677, 1.61209077, 1.55432465]])
Compute the eigenvalues and eigenvectors of all the 3x3 arrays:
In [99]: evals, evecs = np.linalg.eigh(a)
In [100]: evals.shape
Out[100]: (2, 2, 3)
In [101]: evecs.shape
Out[101]: (2, 2, 3, 3)
Take a look at the result for a[0, 0]:
In [102]: evals[0, 0]
Out[102]: array([-0.31729364, 0.83148477, 3.44467813])
In [103]: evecs[0, 0]
Out[103]:
array([[-0.55911658, 0.79634401, 0.23070516],
[ 0.63392772, 0.23128064, 0.73800062],
[-0.53434473, -0.55887877, 0.63413738]])
Verify that it is the same as computing the eigenvalues and eigenvectors for a[0, 0] separately:
In [104]: np.linalg.eigh(a[0, 0])
Out[104]:
(array([-0.31729364, 0.83148477, 3.44467813]),
array([[-0.55911658, 0.79634401, 0.23070516],
[ 0.63392772, 0.23128064, 0.73800062],
[-0.53434473, -0.55887877, 0.63413738]]))

Python, creating a large-dimensional matrix of 3-dimensional dot products

This is my goal, using Python Numpy:
I would like to create a (1000,1000) dimensional array/matrix of dot product values. That means each array/matrix entry is the dot product of vectors 1 through 1000. Constructing this is theoretically simple: one defines a (1,1000) dimensional matrix of vectors v1, v2, ..., v1000
import numpy as np
vectorvalue = np.matrix([v1, v2, v3, ..., v1000])
and takes the dot product with the transpose, i.e.
matrix_of_dotproducts = np.tensordot(vectorvalue.T, vectorvalue)
And the shape of the array/matrix will be (1000, 1000). The (1,1) entry will be the dot product of vectors (v1,v1), the (1,2) entry will be the dot product of vectors (v1,v2), etc. In order to calculate the dot product with numpy for a three-dimensional vector, it's wise to use numpy.tensordot() instead of numpy.dot()
Here's my problem: I'm not beginning with an array of vector values. I'm beginning with three 1000 element arrays of each coordinate values, i.e. an array of x-coordinates, y-coordinates, and z-coordinates.
xvalues = np.array([x1, x2, x3, ..., x1000])
yvalues = np.array([y1, y2, y3, ..., y1000])
zvalues = np.array([z1, z2, z3, ..., z1000])
Is the easiest thing to do to construct a (3, 1000) numpy array/matrix and then take the tensor dot product for each pair?
v1 = np.array([x1,y1,z1])
v2 = np.array([x2,y2,z2])
...
I'm sure there's a more tractable and efficient way to do this...
PS: To be clear, I would like to take a 3D dot product. That is, for vectors
A = (a1, a2, a3)
and B = (b1, b2, b3),
the dot product should be
dotproduct(A,B) = a1b1 + a2b2 + a3b3.
IIUC, you can build the intermediate array as you suggested:
>>> arr = np.vstack([xvalues, yvalues, zvalues]).T
>>> out = arr.dot(arr.T)
Which seems to be what you want:
>>> out.shape
(1000, 1000)
>>> out[3,4]
1.193097281209083
>>> arr[3].dot(arr[4])
1.193097281209083
So, you're not far off with your initial thought. There's very little overhead involved in concatenating the arrays, but if you're interested in doing in within numpy, there's a built-in set of functions, vstack, hstack, and dstack that should perform exactly as you wish. (Vertical, horizontal, and depth respectively)
I'll leave it up to you to determine which to you where, but here's an example shamelessly stolen from the docs to help get you started:
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.vstack((a,b))
array([[1, 2, 3],
[2, 3, 4]])
For reference: vstack docs, hstack docs, and dstack docs
If it feels a little over-the-top to have three separate functions here then you're right! That's why numpy also has the concatenate function. It's just a generalization of vstack, hstack, and dstack that takes an axis argument.
>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6]])
>>> np.concatenate((a, b), axis=0)
array([[1, 2],
[3, 4],
[5, 6]])
Concatenate docs

Categories

Resources