I can first obtain the DFT matrix of a given size, say n by
import numpy as np
n = 64
D = np.fft.fft(np.eye(n))
The FFT is of course just a quick algorithm for applying D to a vector:
x = np.random.randn(n)
ft1 = np.dot(D,x)
print( np.abs(ft1 - np.fft.fft(x)).max() )
# prints near double precision roundoff
The 2D FFT can be obtained by applying D to both the rows and columns of a matrix:
x = np.random.randn(n,n)
ft2 = np.dot(x, D.T) # Apply D to rows.
ft2 = np.dot(D, ft2) # Apply D to cols.
print( np.abs(ft2 - np.fft.fft2(x)).max() )
# near machine round off again
How do I compute this analogously for the 3 dimensional Discrete Fourier Transform?
I.e.,
x = np.random.randn(n,n,n)
ft3 = # dot operations using D and x
print( np.abs(ft3 - np.fft.fftn(x)).max() )
# prints near zero
Essentially, I think I need to apply D to each column vector in the volume, then each row vector in the volume, and finally each "depth vector". But I'm not sure how to do this using dot.
You can use the einsum expression to perform the transformation on each index:
x = np.random.randn(n, n, n)
ft3 = np.einsum('ijk,im->mjk', x, D)
ft3 = np.einsum('ijk,jm->imk', ft3, D)
ft3 = np.einsum('ijk,km->ijm', ft3, D)
print(np.abs(ft3 - np.fft.fftn(x)).max())
1.25571216554e-12
This can also be written as a single NumPy step:
ft3 = np.einsum('ijk,im,jn,kl->mnl', x, D, D, D, optimize=True)
Without the optimize argument (available in NumPy 1.12+), however, this will be very slow. You can also do each of the steps using dot, but it requires a bit of reshaping and transposing. In NumPy 1.14+ the einsum function will automatically detect the BLAS operations and do this for you.
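The dot-based route mentioned above can be sketched as follows. This is a minimal illustration, not the only way to do it: the helper name apply_to_leading_axis is made up for this example, and it relies on flattening the trailing axes so that each step is a single 2-D matrix product.

```python
import numpy as np

n = 8
D = np.fft.fft(np.eye(n))
rng = np.random.default_rng(0)
x = rng.standard_normal((n, n, n))

def apply_to_leading_axis(D, a):
    """Apply D along axis 0 of `a` with one 2-D dot product (hypothetical helper)."""
    flat = a.reshape(a.shape[0], -1)        # (n, n*n)
    return np.dot(D, flat).reshape(a.shape)

ft3 = x.astype(complex)
for _ in range(3):
    # Transform the leading axis, then rotate it to the back so that the
    # next untransformed axis becomes the leading one. After three rotations
    # the axes are back in their original order.
    ft3 = np.moveaxis(apply_to_leading_axis(D, ft3), 0, -1)

print(np.abs(ft3 - np.fft.fftn(x)).max())
```

Each pass costs one (n, n) by (n, n*n) matrix product, so this is still far slower than np.fft.fftn for large n.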
I am currently trying to work through Assignment 1 of the University of Michigan course Deep Learning for Computer Vision (EECS 498-007 / 598-005) by myself, which seems to have a rough equivalent in Stanford CS 231n.
Problem-formulation:
Create a function that computes the pairwise Euclidean distance.
inputs: xtrain, xtest, with dimensions [N,x,x] and [M,x,x] (x being the same number)
output: a distance matrix of shape [N,M] expressing the distance between each training point and each testing point.
The assignment gives a hint:
Try to formulate the Euclidean distance using two broadcast sums and a matrix multiply.
I am trying to implement this mathematical operation using broadcasting, based on the expansion ||x - y||^2 = ||x||^2 - 2 x·y + ||y||^2, where the middle term is a simple matrix multiplication.
I am struggling with the tensor-shapes. My implementation so far is as follows:
def euc_no_loop(x,y):
    #hint: two broadcast sums
    xsq = torch.sum(x**2,axis=1)
    print(xsq.shape)
    ysq = torch.sum(y**2,axis=1)
    print(ysq.shape)
    #and one matrix multiply
    mixprod = -2 * x.view(x.shape[0],-1).matmul(y.view(y.shape[0],-1).T)
    print(mixprod.shape)
    euc_dist = torch.sqrt(xsq + mixprod + ysq.unsqueeze(1).T)
    return euc_dist
With inputs being:
x = torch.randn(5,3,3)
y = torch.randn(3,3,3)
shapes become:
xsq: [5,3]
ysq: [3,3]
mixprod: [5,3]
And output dimension becomes [3,5,3].
Many other StackOverflow threads exist, where numpy is used - but the numpy dot-product seems to be more flexible than torch.matmul.
Example on numpy-solution: Compute L2 distance with numpy using matrix multiplication
I simply don't understand where I am going wrong.
To compute pairwise distances, each input point should be treated as a single vector. So I assume that each x by x matrix should be summed over entirely, i.e. (N, x, x) => (N):
def euc_no_loop(x, y):
    # Suppose x has (N, x, x) and y has (M, x, x) dimensions
    xsq = torch.sum(x**2, dim=(1, 2))  # (N,)
    print(xsq.shape)
    ysq = torch.sum(y**2, dim=(1, 2))  # (M,)
    print(ysq.shape)
    mixprod = -2 * x.view(x.shape[0], -1) @ y.view(y.shape[0], -1).T  # (N, M)
    print(mixprod.shape)
    euc_dist = torch.sqrt(xsq.unsqueeze(1) + mixprod + ysq.unsqueeze(0))  # (N,1)+(N,M)+(1,M) => (N, M)
    return euc_dist
Or just flatten the input tensors first, in which case the two sums reduce over dim=1 only:
x = x.flatten(start_dim=1)
y = y.flatten(start_dim=1)
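To check the shapes and the identity independently of PyTorch, here is a minimal NumPy sketch of the same idea (two broadcast sums and one matrix multiply), compared against a brute-force loop. The function name pairwise_euclidean and the clamping of small negative values are choices made for this illustration, not part of the assignment.

```python
import numpy as np

def pairwise_euclidean(x, y):
    """Distances between every item of x (N, ...) and every item of y (M, ...)."""
    xf = x.reshape(x.shape[0], -1)            # (N, d)
    yf = y.reshape(y.shape[0], -1)            # (M, d)
    xsq = (xf ** 2).sum(axis=1)[:, None]      # (N, 1): first broadcast sum
    ysq = (yf ** 2).sum(axis=1)[None, :]      # (1, M): second broadcast sum
    sq = xsq - 2.0 * (xf @ yf.T) + ysq        # (N, M)
    # Roundoff can make tiny distances slightly negative, so clamp before sqrt.
    return np.sqrt(np.maximum(sq, 0.0))

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3, 3))
y = rng.standard_normal((3, 3, 3))
dists = pairwise_euclidean(x, y)

# Brute-force reference: Frobenius norm of each pairwise difference.
brute = np.array([[np.linalg.norm(a - b) for b in y] for a in x])
print(dists.shape, np.allclose(dists, brute))
```

The torch version maps onto this line by line: reshape corresponds to flatten(start_dim=1), and [:, None] / [None, :] correspond to unsqueeze(1) / unsqueeze(0).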
I have the code below for fft2 performed by numpy and a 2D FFT performed by direct code. Can anyone point out why they are different? My input matrix is rA.
def DFT_matrix(N):
    i, j = np.meshgrid(np.arange(N), np.arange(N))
    omega = np.exp( - 2 * math.pi * 1J / N )
    W = np.power( omega, i * j ) / np.sqrt(N)
    return W
sizeM=40
rA=np.random.rand(sizeM,sizeM)
rAfft=np.fft.fft2(rA)
rAfftabs=np.abs(rAfft)+1e-9
dftMtx=DFT_matrix(sizeM)
dftR=dftMtx.conj().T
mA=dftMtx*rA*dftR
mAabs=np.abs(mA)+1e-9
print(np.allclose(mAabs, rAfftabs))
There are a few problems with your implementation.
1. DFT Matrix formula
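First of all, as explained here, the DFT X of an MxN signal x is defined as:
X = W_M x W_N, where W_N is the NxN DFT matrix with entries W_N[j, k] = exp(-2πi·j·k/N), and W_M is defined likewise for size M.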
Since you are computing the DFT for a MxM input, you just need to compute the DFT_Matrix once. Also note that due to the way W is defined, there is no need for conjugation and since W is symmetric and unitary there is no need for any transpose.
2. Matrix multiplication
When it comes to actually multiplying the matrices together, you have to make sure to use the matrix multiplication operator @ instead of the element-wise multiplier *
3. DFT_matrix normalization
By default the fft functions don't normalize your output. This means that before comparing the two outputs, you either have to divide the np.fft.fft2 result by sqrt(M*M) = M or you drop the np.sqrt(N) in your DFT_matrix function.
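As a third option, NumPy's FFT functions accept norm="ortho", which applies the same 1/sqrt(N) scaling per axis as a normalized DFT matrix. A small sketch, where unitary_dft_matrix is a name introduced only for this example:

```python
import numpy as np

def unitary_dft_matrix(N):
    """N x N DFT matrix including the 1/sqrt(N) normalization."""
    j, k = np.meshgrid(np.arange(N), np.arange(N))
    return np.exp(-2j * np.pi * j * k / N) / np.sqrt(N)

rng = np.random.default_rng(0)
a = rng.random((6, 4))

lhs = unitary_dft_matrix(6) @ a @ unitary_dft_matrix(4)
rhs = np.fft.fft2(a, norm="ortho")   # scaled by 1/sqrt(6 * 4)
print(np.allclose(lhs, rhs))
```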
Summary:
Here is your example with the appropriate fixes for an MxN input. At the end, the magnitudes and angles are compared.
import numpy as np
def DFT_matrix(N):
    i, j = np.meshgrid(np.arange(N), np.arange(N))
    omega = np.exp( - 2 * np.pi * 1j / N )
    W = np.power( omega, i * j ) # Normalization by sqrt(N) not included
    return W
sizeM=40
sizeN=20
np.random.seed(0)
rA=np.random.rand(sizeM,sizeN)
rAfft=np.fft.fft2(rA)
dftMtxM=DFT_matrix(sizeM)
dftMtxN=DFT_matrix(sizeN)
# Matrix multiply the 3 matrices together
mA = dftMtxM @ rA @ dftMtxN
print(np.allclose(np.abs(mA), np.abs(rAfft)))
print(np.allclose(np.angle(mA), np.angle(rAfft)))
Both checks should evaluate to True. Note, however, that the complexity of this algorithm, assuming M=N, is O(N³), while the library's fft2 brings that down to O(N² log N)!
What would you do if you had n particles on a plane (with positions (x_n,y_n)), each with a certain flux flux_n, and you had to pixelate these particles, i.e. go from (x,y) to (pixel_i, pixel_j) space and sum up the flux of the m particles which fall into every single pixel? Any suggestions? Thank you!
There are several ways in which you can solve your problem.
Assumptions: your positions have been stored in two numpy arrays of shape (N, ), i.e. the position x_n (or y_n) for n in [0, N); let's call them x and y. The flux is stored in a numpy array with the same shape, fluxes.
1 - INTENSIVE CASE
Create something that looks like a grid:
#get minimums and maximums position
mins = int(x.min()), int(y.min())
maxs = int(x.max()), int(y.max())
#actually you can also add and subtract 1 or more unit
#in order to have a grid larger than the x, y extremes
#something like mins-=epsilon and maxs += epsilon
#create the grid
xx = np.arange(mins[0], maxs[0])
yy = np.arange(mins[1], maxs[1])
Now you can perform a double for loop, taking, each time, two consecutive elements of xx and yy. To do this, you can simply take:
x1 = xx[:-1] #excluding the last element
x2 = xx[1:] #excluding the first element
#the same for y:
y1 = yy[:-1] #excluding the last element
y2 = yy[1:] #excluding the first element
# one pixel per pair of consecutive edges
fluxes_grid = np.zeros((x1.shape[0], y1.shape[0]))
for i, (x1_i, x2_i) in enumerate(zip(x1, x2)):
    for j, (y1_j, y2_j) in enumerate(zip(y1, y2)):
        idx = np.where((x>=x1_i) & (x<x2_i) & (y>=y1_j) & (y<y2_j))[0]
        fluxes_grid[i,j] = np.sum(fluxes[idx])
At the end of this loop you have a grid whose elements are pixels representing the sum of fluxes.
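The same per-pixel sum can also be sketched without Python loops using np.histogram2d with its weights argument. This is a swapped-in alternative to the loop above, not part of the original approach, and the sample data and edge arrays are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=1000)   # hypothetical particle positions
y = rng.uniform(0.0, 5.0, size=1000)
fluxes = rng.random(1000)

x_edges = np.arange(0, 11)   # 11 edges -> 10 pixels along x
y_edges = np.arange(0, 6)    # 6 edges  -> 5 pixels along y

# Each particle adds its flux (instead of a count of 1) to its pixel.
fluxes_grid, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges], weights=fluxes)

# Spot-check one pixel against the explicit mask used in the loop version.
mask = (x >= 2) & (x < 3) & (y >= 1) & (y < 2)
print(fluxes_grid.shape, np.isclose(fluxes_grid[2, 1], fluxes[mask].sum()))
```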
2 - USING A QUANTIZATION ALGORITHM LIKE K-NN
What happens if you have a lot of points, so many that the loop takes hours?
A faster solution is to use a quantization method, like K Nearest Neighbors (KNN), on a rigid grid. There are many ways to run a KNN (including existing implementations, e.g. sklearn's KNN), but this is very efficient if you can take advantage of a GPU. For example, this is my TensorFlow (v2.1) implementation. After you have defined a square grid:
_min, _max = min(mins), max(maxs)
xx = np.arange(_min, _max)
yy = np.arange(_min, _max)
You can build the matrix, grid, and your position matrix, X:
grid = np.column_stack([xx, yy])
X = np.column_stack([x, y])
Then you have to define a pairwise Euclidean distance function:
@tf.function
def pairwise_dist(A, B):
    # squared norms of each row in A and B
    na = tf.reduce_sum(tf.square(A), 1)
    nb = tf.reduce_sum(tf.square(B), 1)
    # na as a column vector and nb as a row vector
    na = tf.reshape(na, [-1, 1])
    nb = tf.reshape(nb, [1, -1])
    # return the pairwise Euclidean distance matrix
    D = tf.sqrt(tf.maximum(na - 2*tf.matmul(A, B, False, True) + nb, 0.0))
    return D
Thus:
#compute the pairwise distances:
D = pairwise_dist(grid, X)
D = D.numpy() #get a numpy matrix from a tf tensor
#D has shape M, N, where M is the number of points in the grid and N the number of positions.
#now take a rank and from this the best K (e.g. 10)
ranks = np.argsort(D, axis=1)[:, :10]
#for each point in the grid you have the nearest ten.
Now you have to take the fluxes corresponding to these 10 positions and sum them.
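That last step can be sketched with NumPy fancy indexing. The distance matrix here is random stand-in data, used only so the example runs on its own; in practice D would come from pairwise_dist.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 4, 50, 10
D = rng.random((M, N))        # stand-in for the (M, N) pairwise distance matrix
fluxes = rng.random(N)        # one flux per position

ranks = np.argsort(D, axis=1)[:, :K]    # (M, K): indices of the K nearest positions
flux_sums = fluxes[ranks].sum(axis=1)   # (M,): summed flux per grid point

# Reference computed one grid point at a time.
reference = np.array([fluxes[ranks[m]].sum() for m in range(M)])
print(np.allclose(flux_sums, reference))
```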
I have not specified this second method further, because I don't know the size of your catalogue, whether or not you have a GPU, or whether you want this kind of optimization.
If you are interested, I can expand this explanation.
Suppose I have two 2D NumPy arrays A and B, I would like to compute the matrix C whose entries are C[i, j] = f(A[i, :], B[:, j]), where f is some function that takes two 1D arrays and returns a number.
For instance, if def f(x, y): return np.sum(x * y) then I would simply have C = np.dot(A, B). However, for a general function f, are there NumPy/SciPy utilities I could exploit that are more efficient than doing a double for-loop?
For example, take def f(x, y): return np.sum(x != y) / len(x), where x and y are not simply 0/1-bit vectors.
Here is a reasonably general approach using broadcasting.
First, reshape your two matrices to be rank-four tensors.
A = A.reshape(A.shape + (1, 1))
B = B.reshape((1, 1) + B.shape)
Second, apply your function element by element without performing any reduction.
C = f(A, B) # e.g. A != B
Having reshaped your matrices allows numpy to broadcast. The resulting tensor C has shape A.shape + B.shape.
Third, apply any desired reduction by, for example, summing over the indices you want to discard:
C = C.sum(axis=(1, 3)) / C.shape[0]
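For the specific mismatch-fraction example, a more compact variant aligns the shared axis directly, so that only A[i, k] and B[k, j] with the same k are compared. The sketch below uses small random integer arrays as assumed inputs and checks the result against the double loop.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 3, size=(4, 6))   # (m, k)
B = rng.integers(0, 3, size=(6, 5))   # (k, n)

# Broadcast (m, k, 1) against (1, k, n) to get an (m, k, n) boolean array.
mismatch = A[:, :, None] != B[None, :, :]
C = mismatch.sum(axis=1) / A.shape[1]   # reduce over the shared axis k

# Double-loop reference using f(x, y) = np.sum(x != y) / len(x).
C_loop = np.array([[np.sum(A[i, :] != B[:, j]) / A.shape[1]
                    for j in range(B.shape[1])]
                   for i in range(A.shape[0])])
print(np.allclose(C, C_loop))
```

The intermediate array has m*k*n elements, so memory rather than speed becomes the limit for large inputs.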
I need to efficiently calculate the euclidean weighted distances for every x,y point in a given array to every other x,y point in another array. This is the code I have which works as expected:
import numpy as np
import random
def rand_data(integ):
    '''
    Function that generates 'integ' random values between [0.,1.)
    '''
    rand_dat = [random.random() for _ in range(integ)]
    return rand_dat
def weighted_dist(indx, x_coo, y_coo):
    '''
    Function that calculates *weighted* euclidean distances.
    '''
    dist_point_list = []
    # Iterate through every point in array_2.
    for indx2, x_coo2 in enumerate(array_2[0]):
        y_coo2 = array_2[1][indx2]
        # Weighted distance in x.
        x_dist_weight = (x_coo-x_coo2)/w_data[0][indx]
        # Weighted distance in y.
        y_dist_weight = (y_coo-y_coo2)/w_data[1][indx]
        # Weighted distance between point from array_1 passed and this point
        # from array_2.
        dist = np.sqrt(x_dist_weight**2 + y_dist_weight**2)
        # Append weighted distance value to list.
        dist_point_list.append(round(dist, 8))
    return dist_point_list
# Generate random x,y data points.
array_1 = np.array([rand_data(10), rand_data(10)], dtype=float)
# Generate weights for each x,y coord for points in array_1.
w_data = np.array([rand_data(10), rand_data(10)], dtype=float)
# Generate second larger array.
array_2 = np.array([rand_data(100), rand_data(100)], dtype=float)
# Obtain *weighted* distances for every point in array_1 to every point in array_2.
dist = []
# Iterate through every point in array_1.
for indx, x_coo in enumerate(array_1[0]):
    y_coo = array_1[1][indx]
    # Call function to get weighted distances for this point to every point in
    # array_2.
    dist.append(weighted_dist(indx, x_coo, y_coo))
The final list dist holds as many sub-lists as points are in the first array with as many elements in each as points are in the second one (the weighted distances).
I'd like to know if there's a way to make this code more efficient, perhaps using the cdist function, because this process becomes quite expensive when the arrays have lots of elements (which in my case they have) and when I have to check the distances for lots of arrays (which I also have)
@Evan and @Martinis Group are on the right track - to expand on Evan's answer, here's a function that uses broadcasting to quickly calculate the n-dimensional weighted euclidean distance without Python loops:
import numpy as np
def fast_wdist(A, B, W):
    """
    Compute the weighted euclidean distance between two arrays of points:
    D{i,j} =
    sqrt( ((A{0,i}-B{0,j})/W{0,i})^2 + ... + ((A{k,i}-B{k,j})/W{k,i})^2 )
    inputs:
        A is an (k, m) array of coordinates
        B is an (k, n) array of coordinates
        W is an (k, m) array of weights
    returns:
        D is an (m, n) array of weighted euclidean distances
    """
    # compute the differences and apply the weights in one go using
    # broadcasting jujitsu. the result is (n, k, m)
    wdiff = (A[np.newaxis,...] - B[np.newaxis,...].T) / W[np.newaxis,...]
    # square and sum over the second axis, take the sqrt and transpose. the
    # result is an (m, n) array of weighted euclidean distances
    D = np.sqrt((wdiff*wdiff).sum(1)).T
    return D
To check that this works OK, we'll compare it to a slower version that uses nested Python loops:
def slow_wdist(A, B, W):
    k,m = A.shape
    _,n = B.shape
    D = np.zeros((m, n))
    for ii in range(m):
        for jj in range(n):
            wdiff = (A[:,ii] - B[:,jj]) / W[:,ii]
            D[ii,jj] = np.sqrt((wdiff**2).sum())
    return D
First, let's make sure that the two functions give the same answer:
# make some random points and weights
def setup(k=2, m=100, n=300):
    return np.random.randn(k,m), np.random.randn(k,n),np.random.randn(k,m)
a, b, w = setup()
d0 = slow_wdist(a, b, w)
d1 = fast_wdist(a, b, w)
print(np.allclose(d0, d1))
# True
Needless to say, the version that uses broadcasting rather than Python loops is several orders of magnitude faster:
%%timeit a, b, w = setup()
slow_wdist(a, b, w)
# 1 loops, best of 3: 647 ms per loop
%%timeit a, b, w = setup()
fast_wdist(a, b, w)
# 1000 loops, best of 3: 620 us per loop
You could use cdist if you don't need weighted distances. If you need weighted distances and performance, create an array of the appropriate output size, and use either an automated accelerator like Numba or Parakeet, or hand-tune the code with Cython.
You can avoid looping by using code that looks like the following:
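def compute_distances(A, B, W):
    # A is (m, 2), B is (n, 2); W must broadcast against the (n, m) result
    Ax = A[:,0].reshape(1, A.shape[0])
    Bx = B[:,0].reshape(B.shape[0], 1)
    dx = Bx-Ax
    # Same for dy
    Ay = A[:,1].reshape(1, A.shape[0])
    By = B[:,1].reshape(B.shape[0], 1)
    dy = By-Ay
    dist = np.sqrt(dx**2 + dy**2) * W
    return dist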
That will run a lot faster in Python than anything that loops, as long as you have enough memory for the arrays.
If you only need to compare or rank distances, you could try removing the square root: for non-negative a and b, a > b implies a squared > b squared, and square roots are comparatively slow to compute.
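A small sketch of that point, using made-up random points: the nearest neighbour found from squared distances is the same as the one found from true distances, so the sqrt can be skipped when only the ordering matters.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 2))    # 5 query points
B = rng.random((8, 2))    # 8 candidate points

diff = A[:, None, :] - B[None, :, :]    # (5, 8, 2) pairwise differences
sq_dist = (diff ** 2).sum(axis=2)       # squared distances, no sqrt

nearest_from_sq = sq_dist.argmin(axis=1)
nearest_from_dist = np.sqrt(sq_dist).argmin(axis=1)
print(np.array_equal(nearest_from_sq, nearest_from_dist))
```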