Deterministic Fourier Deconvolution - python

I'm having a little trouble doing a Fourier deconvolution using numpy. I'm currently attempting this on a test case of 3 Gaussians, so I know exactly what to expect at each end.
What I'm trying to recover is the input signal, given the exact form of the filter and the output.
Here I have used a naive constraint to suppress the high-frequency end, setting it to zero (justified because the signals are all Gaussians in Fourier space as well). I expected to recover my original input with a tiny bit of ringing due to this constraint.
import numpy as np
import matplotlib.pyplot as plt

# Dummy case: Gaussian convolved with Gaussian
N = 128
x = np.arange(-5, 5, 10./(2 * N))
epsilon = 1e-18

def gaus(x, sigma):
    return 1./np.sqrt(2*np.pi)/sigma * np.exp(-(x * x)/(2 * sigma**2))

y_g = gaus(x, 0.3)                       # output: Gaussian-blurred signal
y_b = gaus(x, 0.1)                       # Gaussian blur filter
y_i = gaus(x, np.sqrt(0.3**2 - 0.1**2))  # original Gaussian input
f_yg = np.fft.fft(y_g)                   # fft of the blurred output
f_yb = np.fft.fft(y_b)                   # fft of the filter
f_yi = np.fft.fft(y_i)
r_f = (np.fft.fftshift(f_yg)+epsilon)/(np.fft.fftshift(f_yb)+epsilon)  # deconvolve by division in Fourier space
r_f[np.abs(x) > 0.5] = 0  # naive constraint to remove artifacts, knowing the final form is Gaussian
r_f = np.fft.ifftshift(r_f)
r_if = np.fft.ifft(r_f)
y_gf = np.fft.ifft(f_yg)
y_bf = np.fft.ifft(f_yb)
y_if = np.fft.ifft(f_yi)
plt.plot(x, y_if, label='fft true input')
plt.plot(x, r_if, label='fft recv. input')
plt.legend(framealpha=0.)
plt.show()
Here the orange is the recovered input signal using the deconvolution of the output and the blur.
There are a few questions I have with this:
There is clearly a scaling issue. The only place I can think this may come in is when I applied the naive constraint. Should I renormalize in this step, knowing that 1/sqrt(N) times the integral over my Fourier space is equal to 1?
It looks like the position of the recovered Gaussian is messed up, with half of the curve on either side of the plot. Is this due to the division in Fourier space? How do I recover the original position (or have I done this completely wrong to begin with)?
I've attached the script used to generate the two curves, the original input and recovered input in physical space.
Cheers,
Keven
EDIT: I should add that I have no problem restoring the signal using scipy.signal.deconvolve plus some small edits. This must mean my method here is somehow wrong?

1) As you correctly understood, the need for a scaling is related to the Discrete Fourier Transform. The best way to derive it is to compute the deconvolution of two uniform unit signals. Their DFT is n 0 0 0 ..., where n is the number of points of the DFT. Hence the ratio r_f is 1 0 0 0 ... and its backward FFT computed by np.fft.ifft() is 1/n 1/n 1/n ...
The correct signal resulting from the deconvolution should have been 1/T 1/T 1/T ..., where T=10. is the length of the frame.
As a consequence, the correct scaling to perform the deconvolution is n/T = len(r_f)/10.:
r_if = r_if * len(r_if) / 10.
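A quick numerical check of this argument (a sketch; n and T match the values used above):

import numpy as np

n, T = 256, 10.0                  # n samples over a frame of length T
r_f = np.zeros(n); r_f[0] = 1.0   # idealized ratio 1 0 0 0 ..., treating 0/0 bins as 0
r_if = np.fft.ifft(r_f).real      # 1/n 1/n 1/n ...
print(r_if[0])                    # 1/n, about 0.0039
print(r_if[0] * n / T)            # 0.1 == 1/T, the exact deconvolved value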
2) The deconvolved signal is translated by half a period. This is due to the fact that the Gaussian kernel is centered on the middle of the frame. Simply shift the kernel by half a period and the problem is solved. The function np.fft.fftshift() can be applied to this end:
f_yb = np.fft.fft(np.fft.fftshift(y_b)) #fft the filter
EDIT: To investigate the reason for the translation, let's focus on the case of the deconvolution kernel being a very narrow Gaussian distribution, nearly corresponding to a Dirac distribution. Your input signal is a Gaussian curve centered at zero, the frame being sampled between -5 and 5. Similarly, the deconvolution kernel is a Dirac centered at zero. As a consequence, the deconvolved signal should be identical to the input signal: a Gaussian curve centered at zero. Nevertheless, the DFT as implemented in FFTW, and consequently in np.fft.fft(), is computed as that of a frame starting at 0 and ending at 10, sampled at points 10j/n where j is in [0..n-1], the frequencies in Fourier space being k/10 where k is in [0..n/2, -n/2+1..-1]. As a consequence, this DFT sees your signal as a Gaussian centered at 5 and the deconvolution kernel as a Dirac centered at 5. The convolution of a function f(t) with a Dirac delta(t-t0) centered at t0 is simply the translated function f(t-t0). Hence, the result of the deconvolution as computed by np.fft.fft() is the input signal translated by half a period. Since the input signal is centered at 0 in the [-5,5] frame, the output signal computed by np.fft.fft() is centered at -5 (or equivalently 5, due to periodicity). Shifting the kernel resolves the mismatch between us thinking of the frame as [-5,5] and np.fft.ifft() handling it as if it were [0,10].
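Putting the two fixes into the original script, the deconvolution block becomes (a sketch; T = 10 is the frame length used above):

f_yg = np.fft.fft(y_g)
f_yb = np.fft.fft(np.fft.fftshift(y_b))   # fix 2: center the kernel on sample 0
r_f = (np.fft.fftshift(f_yg) + epsilon)/(np.fft.fftshift(f_yb) + epsilon)
r_f[np.abs(x) > 0.5] = 0                  # same naive high-frequency screen as before
r_if = np.fft.ifft(np.fft.ifftshift(r_f)) * len(r_f)/10.   # fix 1: scale by n/T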
Filters are often designed to reduce the effect of high-frequency noise. Deconvolution therefore potentially magnifies high-frequency noise. Screening the frequencies as you did is a potential solution. Notice that it is exactly equivalent to convolving the signal with a particular filter!
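To see this equivalence concretely, here is a small sketch (signal and mask are arbitrary): zeroing Fourier bins with a mask produces exactly the circular convolution with the inverse DFT of that mask:

import numpy as np

s = np.random.randn(64)
mask = (np.abs(np.fft.fftfreq(64)) < 0.2).astype(float)   # keep only the low bins
screened = np.fft.ifft(np.fft.fft(s) * mask).real         # frequency screening
kernel = np.fft.ifft(mask).real                           # the equivalent filter
convolved = np.fft.ifft(np.fft.fft(s) * np.fft.fft(kernel)).real
print(np.allclose(screened, convolved))                   # True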
In the field of tomographic reconstruction, the filtered back-projection algorithm requires applying a ramp filter, which dramatically inflates high-frequency noise. There, a Wiener filter is often proposed: this kind of filter can be designed to minimize the mean square error on the deconvolved signal, given the SNR of the convolved signal. It nevertheless requires some assumptions regarding the power spectral densities of the signal and the noise.

Related

Median of derivative in X axis of an image

I computed derivatives using different methods, such as:

1. Convolution with an array [[-1, 1]].
2. Using the Fourier theorem: computing the DFT of the image and of the array mentioned above, multiplying them, and performing the IDFT.
3. Directly through the derivative formula (computing the Fourier transform, multiplying by index and a constant, and computing the inverse).
All methods seem to work almost identically, but have slight differences.
An explanation of why they end up with slightly different results would be appreciated.
After computing those I started playing with the result to learn about it, and I found something that confused me:
The main thing that baffles me is that when I try computing the median of this derivative, it's ALWAYS 0.0.
Why is that?
I added the code I used to compute this (the first method, at least) because maybe I'm doing something wrong.
import numpy as np
from scipy.signal import convolve2d
im = sl.read_image(r'C:\Users\ahhal\Desktop\Essentials\Uni\year3\SemesterA\ImageProcessing\Exercises\Ex2\external\monkey.jpg', 1)
b = [[-1, 1]]
print(np.median(convolve2d(im, b)))
output: 0.0
The read_image function is my own and this is the implementation:
from imageio import imread
from skimage.color import rgb2gray
import numpy as np

def read_image(filename, representation):
    """
    Receives an image file and converts it into one of two given representations.
    :param filename: The file name of an image on disk (could be grayscale or RGB).
    :param representation: representation code, either 1 or 2, defining whether the output
        should be a grayscale image (1) or an RGB image (2). If the input image is grayscale,
        we won't call it with representation = 2.
    :return: An image, represented by a matrix of type np.float64 with intensities
        normalized to the range [0, 1].
    """
    assert representation in [1, 2]
    # read the image
    im = imread(filename)
    if representation == 1:      # If the user specified they need a grayscale image,
        if len(im.shape) == 3:   # AND the image is not grayscale yet,
            im = rgb2gray(im)    # convert to grayscale (assuming it's RGB and not a different format)
    im_float = im.astype(np.float64)  # Convert the image type to one we can work with.
    if im_float.max() > 1:       # If image values are out of bounds, normalize them.
        im_float = im_float / 255
    return im_float
Edit 2:
I tried it on several different images, and got 0.0 at all of them.
The image I'm using in the example is the monkey.jpg photo loaded in the code above.
I computed derivatives using different methods, such as:

1. Convolution with an array [[-1, 1]].
2. Using the Fourier theorem: computing the DFT of the image and of the array mentioned above, multiplying them, and performing the IDFT.
3. Directly through the derivative formula (computing the Fourier transform, multiplying by index and a constant, and computing the inverse).
These derivative methods are all approximate and make different assumptions:
1. Convolution by [[-1, 1]] computes differences between adjacent elements,

derivative ~= data[n+1] − data[n]

You can interpret this as interpolating the data with a line segment, then taking the derivative of that interpolant:

I(x) = data[n] + (data[n+1] − data[n]) * (x − n)

So the approximation assumes the underlying function is locally linear. You can analyze the error by Taylor expansion to find that it comes from the ignored higher-order terms. In other words, the approximation is accurate provided the function doesn't have strong nonlinear terms. This is a simple case of finite differences.
2. This is the same as method 1, except with different boundary handling for samples near the edges of the image. By default, scipy.signal.convolve2d does zero padding (though you can use the boundary option to choose other methods). When computing the convolution through the DFT, however, the boundary handling is implicitly periodic, wrapping around at the image edges. So the results of methods 1 and 2 differ in a margin of pixels near the edges because of the different boundary handling (see the sketch after this list).
3. Computing the derivative by multiplying by iω in the DFT representation can be interpreted as evaluating the derivative of the sinc interpolation of the data. Sinc interpolation assumes the data is bandlimited. The error comes from spectra beyond the Nyquist frequency. In particular, if there is a hard jump discontinuity from an object boundary, then the image is not bandlimited and the DFT-based derivative will have substantial error in the vicinity of the jump, appearing as ringing artifacts.
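As a small check of the boundary-handling difference between methods 1 and 2, this sketch (test image arbitrary) compares spatial and DFT-based convolution; interior pixels agree, edge pixels differ:

import numpy as np
from scipy.signal import convolve2d

im = np.random.rand(16, 16)
b = np.array([[-1., 1.]])
d1 = convolve2d(im, b)                                                 # zero-padded boundaries, shape (16, 17)
d2 = np.fft.ifft2(np.fft.fft2(im) * np.fft.fft2(b, s=im.shape)).real  # periodic boundaries
print(np.allclose(d1[:, 1:-1], d2[:, 1:]))   # True: interior columns match
print(np.allclose(d1[:, 0], d2[:, 0]))       # False: edge column differs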
The main thing that baffles me is that when I try computing the median of this derivative, it's ALWAYS 0.0.
I don't know why this happened here, but it shouldn't always be the case. For instance, if each image row is the unit ramp data[n] = n, then the convolution by [[-1, 1]] is constant (equal to −1, since convolution flips the kernel) everywhere, except possibly at the edges depending on boundary handling, so the median is −1, not 0.
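A minimal sketch of that counterexample (values arbitrary):

import numpy as np
from scipy.signal import convolve2d

im = np.tile(np.arange(8, dtype=float), (8, 1))   # every row is the ramp 0..7
d = convolve2d(im, [[-1, 1]], mode='valid')       # constant -1 (convolution flips the kernel)
print(np.median(d))                               # -1.0, not 0.0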
Pascal already gave a wonderful explanation of the differences between the various approximations to the derivative. So I'll focus here on the "why always 0.0?" question.
The median of the derivative is 0.0 only by approximation. When I compute it, based on the finite difference approximation (method #1), I get -5.15e-5 as the median. Close to zero, but not exactly zero.
The derivative is 0 in uniform (flat) regions of the image, such as the out-of-focus background. Other features in the image tend to have both a positive and a negative edge, making the histogram of the derivative image very symmetric.
This symmetry causes the median (as well as the mean) to be close to zero for such an image. However, this is not always the case. For example, if the image is brighter on the left edge than the right edge (or the other way around), then there must be a net gradient across the image, causing the mean or median to be different from zero.

FFT for 3d sensor signal

I have a 3D array of accelerometer signal data, sampled at 50 Hz, meaning the time step is 1/50 = 0.02 s. My goal is to compute the main frequency of this sensor using NumPy or SciPy. My question is: should I compute the frequency of each column separately, use a multidimensional FFT, or compute a single vector and then take its FFT?
I used the following function to compute the main frequency.
from scipy import fftpack
import numpy as np

def fourier(signal, timestep):
    data = signal - np.mean(signal)
    N = len(data) // 2  # we need half of the data
    freq = fftpack.fftfreq(len(data), d=timestep)[:N]
    fft = fftpack.fft(data)[:N]
    amp = np.abs(fft) / N
    order = np.argsort(amp)[::-1]  # sort based on the importance
    return freq[order][0]
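One simple option (a sketch; I'm assuming the array is laid out as (n_samples, 3), one column per acceleration component) is to apply this function to each column separately:

acc = np.random.randn(1000, 3)   # stand-in for the 50 Hz recordings
for col in range(acc.shape[1]):
    print(fourier(acc[:, col], 1./50.))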
A 3D array of accelerometer sensors produces an array of 5 dimensions: the space coordinates, time and the components of the acceleration.
Taking the DFT over the time dimension corresponds to analysing sensors one at a time: each sensor would produce a main frequency, likely slightly different from one sensor to another, as if the sensors were uncoupled.
As an alternative, let's think about taking the DFT over both spatial coordinates and time. It corresponds to writing the compound signal as a sum of sinusoidal plane waves:

a(x, y, z, t) = (1/N) * sum_{k_x, k_y, k_z, w} A(k_x, k_y, k_z, w) * exp(i(k_x*x + k_y*y + k_z*z - w*t))

where N is a scaling factor obtained by multiplying the number of spatial points by the number of time samples, and A denotes the 4D DFT of the acceleration. In the sequel, I'll drop this global scaling, which is independent of x, y, z, t, k_x, k_y, k_z and w.
At this point, modeling the physics generating this acceleration would be a significant asset. Indeed, using this DFT makes little sense if the phenomenon is dispersive. Nevertheless, diffusion, elasticity and acoustics in a uniform material are non-dispersive: each frequency lives independently from the others. Furthermore, knowing the physics is useful because an energy can be defined. For instance, since the velocity amplitude of a harmonic wave is the acceleration amplitude divided by w, the kinetic energy associated with the wave k_x, k_y, k_z, w writes:

E(k_x, k_y, k_z, w) ∝ |A(k_x, k_y, k_z, w)|^2 / w^2
Therefore, the kinetic energy associated with a given frequency w writes:

W(w) ∝ (1/w^2) * sum_{k_x, k_y, k_z} |A(k_x, k_y, k_z, w)|^2
As a consequence, this reasoning provides a physically-based way to merge the pointwise DFTs over time. Indeed, according to Parseval's identity, the sum over the wavevectors equals (up to scaling) the sum over the sensors:

W(w) ∝ (1/w^2) * sum_{x, y, z} |ã(x, y, z, w)|^2

where ã(x, y, z, w) is the DFT over time of the signal of the sensor at (x, y, z).
Regarding practical considerations, subtracting the average as you did is indeed a good start. If the velocity is to be obtained from the acceleration, i.e. energies are scaled by 1/w^2, the zero frequency (the average) must be zeroed, to avoid the occurrence of infinite values or NaNs.
Moreover, applying a window prior to computing the time DFT could help limit problems related to spectral leakage. The DFT is designed for periodic signals with periods consistent with that of the frame. More specifically, it computes the Fourier transform of a signal built by repeating your frame again and again. As a consequence, artificial discontinuities may appear at the edges, inducing misleading, non-existing frequencies. A window drops to near zero close to the edges of the frame, thus reducing the discontinuities and their effect. Consequently, it could be suggested to apply a window to the space dimensions as well, to keep the consistency with the physical plane-wave decomposition. It would result in giving more weight to the accelerometers at the center of the 3D array.
The plane-wave decomposition also requires that the spatial spacing of the sensors be at most about half the expected wavelength; otherwise, a phenomenon called aliasing occurs. Nevertheless, the power spectrum W(w) might be less sensitive to this issue than the plane-wave decomposition itself. On the contrary, if the elastic strain energy is computed starting from the acceleration, aliasing could become a real problem, because computing the strain requires derivatives with respect to the space coordinates, i.e. multiplications by k_x, k_y or k_z, and spatial aliasing corresponds to using the wrong k_x.
Once W(w) is computed, the frequencies corresponding to each peak can be estimated by computing the mean frequency over the peak with respect to the power density, as in Why are frequency values rounded in signal using FFT?.
Here is a sample code generating some plane waves with frequencies not consistent with the size of the frame (in both time and space). Hanning windows are applied, the kinetic energy is computed, and the frequencies corresponding to each peak are retrieved.
import matplotlib.pyplot as plt
import numpy as np
from scipy import signal

spacingx = 1.
spacingy = 1.
spacingz = 1.
spacingt = 1./50.
Nx = 5
Ny = 5
Nz = 5
Nt = 512
frequency1 = 9.5
frequency2 = 13.7
frequency3 = 22.3

# building a signal
acc = np.zeros((Nx, Ny, Nz, Nt, 3))
for i in range(Nx):
    for j in range(Ny):
        for k in range(Nz):
            for l in range(Nt):
                acc[i,j,k,l,0] = np.sin(i*spacingx+j*spacingy-2*np.pi*frequency1*l*spacingt)
                acc[i,j,k,l,1] = np.sin(i*spacingx+1.5*k*spacingz-2*np.pi*frequency2*l*spacingt)
                acc[i,j,k,l,2] = np.sin(1.5*i*spacingx+k*spacingz-2*np.pi*frequency3*l*spacingt)

# applying a window in both time and space
hanningx = np.hanning(Nx)
hanningy = np.hanning(Ny)
hanningz = np.hanning(Nz)
hanningt = np.hanning(Nt)
for i in range(Nx):
    hx = hanningx[i]
    for j in range(Ny):
        hy = hanningy[j]
        for k in range(Nz):
            hz = hanningz[k]   # (fixed: was hanningx[k])
            for l in range(Nt):
                ht = hanningt[l]
                acc[i,j,k,l,0] *= hx*hy*hz*ht
                acc[i,j,k,l,1] *= hx*hy*hz*ht
                acc[i,j,k,l,2] *= hx*hy*hz*ht

# computing the DFT over time
acctilde = np.fft.fft(acc, axis=3)

# kinetic energy
print(acctilde.shape[3])
kineticW = np.zeros(acctilde.shape[3])
frequencies = np.fft.fftfreq(Nt, spacingt)
for l in range(Nt):
    oneonomegasquared = 0.
    if l > 0:
        oneonomegasquared = 1.0/(frequencies[l]*frequencies[l])
    for i in range(Nx):
        for j in range(Ny):
            for k in range(Nz):
                kineticW[l] += oneonomegasquared*(np.real(np.vdot(acctilde[i,j,k,l,:], acctilde[i,j,k,l,:])))

plt.plot(frequencies[0:acctilde.shape[3]], kineticW, 'k-', label=r'$W(f)$')
plt.legend()
plt.show()

# see https://stackoverflow.com/questions/54714169/why-are-frequency-values-rounded-in-signal-using-fft/54775867#54775867
peaks, _ = signal.find_peaks(kineticW, height=np.max(kineticW)*0.1)
print("potential frequencies index", peaks)

# compute the mean frequency of each peak with respect to power density
powerpeak = np.zeros(len(peaks))
powerpeaktimefrequency = np.zeros(len(peaks))
for i in range(len(kineticW)):
    dist = 1000
    jnear = 0
    for j in range(len(peaks)):
        if dist > np.abs(i-peaks[j]):
            dist = np.abs(i-peaks[j])
            jnear = j
    powerpeak[jnear] += kineticW[i]
    powerpeaktimefrequency[jnear] += kineticW[i]*frequencies[i]
powerpeaktimefrequency = np.divide(powerpeaktimefrequency, powerpeak)
print('corrected frequencies', powerpeaktimefrequency)

How do I scale an FFT-based cross-correlation such that its peak is equal to Pearson's rho

Description of the problem
FFT can be used to compute cross-correlation between two signals or images. To determine the delay or lag between two signals A and B, it suffices to locate the peak of:
IFFT(FFT(A)*conjugate(FFT(B)))
However, the amplitude of the peak is related to the amplitude of the frequency spectra of the individual signals. Thus to determine the Pearson correlation (rho), the amplitude of this peak must be scaled by the total energy in the two signals.
One way to do this is to normalize by the geometric mean of the individual autocorrelations. This gives a reasonable approximation of rho, especially when the delay between samples is small, but not the exact value.
I thought the reason for this error was that the Pearson correlation is only defined for the overlapping portions of the signal, whereas the normalization factor (the geometric mean of the two autocorrelation peaks) includes contributions from the non-overlapping portions. I considered two approaches for fixing this and producing an exact value for rho via FFT. In the first (called rho_exact_1 below), I trimmed the samples down to their overlapping portions and computed the normalization factor from those. In the second (called rho_exact_2 below), I computed the fraction of the measurements contained in the overlapping portion of the signals and multiplied the full-autocorrelation-normalization factor by that fraction.
Neither works! The figure below shows plots of the three approaches for calculating Pearson's rho using DFT-based cross-correlation. Only the region of the cross-correlation peak is shown. Each estimate is close to the correct value of 1.0, but not equal to it.
The code I used to perform the calculations is below. I used a simple sine wave as an example signal. I noticed that if I use a square-wave (w/ duty cycle not necessarily 50%) the approaches' errors change.
Can somebody explain what's going on?
A working example
import numpy as np
from matplotlib import pyplot as plt

# make a time vector w/ 256 points
# and a source signal
N_cycles = 10.0
N_points = 256.0
t = np.arange(0, N_cycles*np.pi, np.pi*N_cycles/N_points)
signal = np.sin(t)

use_rect = False
if use_rect:
    threshold = -0.75
    signal[np.where(signal >= threshold)] = 1.0
    signal[np.where(signal < threshold)] = -1.0

# normalize the signal (not technically
# necessary for this example, but required
# for measuring correlation of physically
# different signals)
signal = signal/signal.std()

# generate two samples of the signal
# with a temporal offset:
N = 128
offset = 5
sample_1 = signal[:N]
sample_2 = signal[offset:N+offset]

# determine the offset through cross-
# correlation
xc_num = np.abs(np.fft.ifft(np.fft.fft(sample_1)*np.fft.fft(sample_2).conjugate()))
offset_estimate = np.argmax(xc_num)
if offset_estimate > N//2:
    offset_estimate = offset_estimate - N

# for an approximate estimate of Pearson's
# correlation, we normalize by the RMS
# of individual autocorrelations:
autocorrelation_1 = np.abs(np.fft.ifft(np.fft.fft(sample_1)*np.fft.fft(sample_1).conjugate()))
autocorrelation_2 = np.abs(np.fft.ifft(np.fft.fft(sample_2)*np.fft.fft(sample_2).conjugate()))
xc_denom_approx = np.sqrt(np.max(autocorrelation_1))*np.sqrt(np.max(autocorrelation_2))
rho_approx = xc_num/xc_denom_approx
print('rho_approx', np.max(rho_approx))

# this is an approximation because we've
# included autocorrelation of the whole samples
# instead of just the overlapping portion;
# using cropped versions of the samples should
# yield the correct correlation:
sample_1_cropped = sample_1[offset:]
sample_2_cropped = sample_2[:-offset]

# these should be identical vectors:
assert np.all(sample_1_cropped == sample_2_cropped)

# compute autocorrelations of cropped samples
# and corresponding value for rho
autocorrelation_1_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_1_cropped)*np.fft.fft(sample_1_cropped).conjugate()))
autocorrelation_2_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_2_cropped)*np.fft.fft(sample_2_cropped).conjugate()))
xc_denom_exact_1 = np.sqrt(np.max(autocorrelation_1_cropped))*np.sqrt(np.max(autocorrelation_2_cropped))
rho_exact_1 = xc_num/xc_denom_exact_1
print('rho_exact_1', np.max(rho_exact_1))

# alternatively we could try to use the
# whole sample autocorrelations and just
# scale by the number of pixels used to
# compute the numerator:
scaling_factor = float(len(sample_1_cropped))/float(len(sample_1))
rho_exact_2 = xc_num/(xc_denom_approx*scaling_factor)
print('rho_exact_2', np.max(rho_exact_2))

# finally a sanity check: is rho actually 1.0
# for the two signals:
rho_corrcoef = np.corrcoef(sample_1_cropped, sample_2_cropped)[0, 1]
print('rho_corrcoef', rho_corrcoef)

x = np.arange(len(rho_approx))
plt.plot(x, rho_approx, label='FFT rho_approx')
plt.plot(x, rho_exact_1, label='FFT rho_exact_1')
plt.plot(x, rho_exact_2, label='FFT rho_exact_2')
plt.plot(x, np.ones(len(x))*rho_corrcoef, 'k--', label='Pearson rho')
plt.legend()
plt.ylim((.75, 1.25))
plt.xlim((0, 20))
plt.show()
The normalised cross-correlation between two N-periodic discrete signals F and G is defined as:

r_x = (sum_n F[n] * G[n+x]) / (||F|| * ||G_x||)

where G_x denotes G circularly shifted by x. Since the numerator is a dot product between the two vectors F and G_x, and the denominator is the product of the norms of these two vectors, the scalar r_x must indeed lie between -1 and +1, and it is the cosine of the angle between the vectors (see there). If the vectors F and G_x are aligned, then r_x = 1; conversely, if r_x = 1 then F and G_x are aligned, by the equality case of the Cauchy-Schwarz inequality. To preserve these properties, the vectors in the numerator must match those in the denominator.
All the numerators can be computed at once by using the Discrete Fourier Transform. Indeed, that transform turns the circular correlation into pointwise products in Fourier space. Here is why the different estimated normalized cross-correlations are not 1 in the tests you performed.
For the first test, rho_approx, sample_1 and sample_2 are both extracted from a periodic signal. Both are of the same length, but that length is not a multiple of the period: it spans 2.5 periods (5π). As a result, since the DFT performs the correlation as if they were periodic signals, it finds that sample_1 and sample_2 are not perfectly correlated, and r_x < 1.
For the second test, rho_exact_1, the convolution is performed on signals of length N = 128, but the norms in the denominator are computed on truncated vectors of size N - offset = 128 - 5. As a result, the properties of r_x are lost. In addition, notice that the proposed convolution and norms are not normalized: the computed norms and convolution products grow with the number of points of the vectors considered. As a result, the norms of the truncated vectors are slightly lower than in the previous case and r_x increases: values larger than 1 are likely to be encountered as the offset increases.
For the third test, rho_exact_2, a scaling factor is introduced to try to correct the first test: the properties of r_x are lost as well, and values larger than one can be encountered, since the denominator is multiplied by a factor smaller than one.
Nevertheless, numpy's corrcoef() function actually computes an r_x equal to 1 for the truncated signals. Indeed, those signals are perfectly identical! The same result can be obtained using DFTs:
xc_num_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_1_cropped)*np.fft.fft(sample_2_cropped).conjugate()))
autocorrelation_1_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_1_cropped)*np.fft.fft(sample_1_cropped).conjugate()))
autocorrelation_2_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_2_cropped)*np.fft.fft(sample_2_cropped).conjugate()))
xc_denom_exact_11 = np.sqrt(np.max(autocorrelation_1_cropped))*np.sqrt(np.max(autocorrelation_2_cropped))
rho_exact_11 = xc_num_cropped/xc_denom_exact_11
print('rho_exact_11', np.max(rho_exact_11))
To provide the user with a meaningful value for r_x, you can stick to the value provided by the first test, which can be lower than one for identical periodic signals if the length of the frame is not a multiple of the period. To correct this drawback, the estimated offset can be retrieved and used to build two cropped signals of the same length. The whole correlation procedure must then be re-run to get a new value for r_x, which will no longer be plagued by the fact that the length of the cropped frame is not a multiple of the period.
Lastly, while the DFT is a very efficient way to compute the numerator for all values of x at once, the denominator can be computed efficiently as the 2-norms of vectors using numpy.linalg.norm. Since argmax(r_x) for the cropped signals will likely be zero if the first correlation was successful, it could be sufficient to compute r_0 using a dot product: sample_1_cropped.dot(sample_2_cropped).
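For the periodic (circular) correlation specifically, the norm of a circularly shifted vector equals the norm of the vector itself, so a single denominator serves all lags. A minimal sketch along those lines (circular_pearson is a hypothetical helper name):

import numpy as np

def circular_pearson(a, b):
    # normalized circular cross-correlation via the DFT;
    # ||b shifted by x|| == ||b|| for circular shifts, so one denominator serves all lags
    a = a - a.mean()
    b = b - b.mean()
    xc = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b).conjugate()).real
    return xc / (np.linalg.norm(a) * np.linalg.norm(b))

This stays within [-1, 1] for all lags and equals 1 at lag 0 for identical signals, but note it measures the periodic correlation, not the overlap-only Pearson rho discussed above.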

When should I use fftshift(fft(fftshift(x))) and when fft(x)?

I am trying to implement an algorithm in Python, but I am not sure when I should use fftshift(fft(fftshift(x))) and when only fft(x) (from numpy). Is there a rule of thumb based on the shape of the input data?
I am using fftshift instead of ifftshift due to the even number of values in the vector x.
It really just depends on what you want. The DFT (and hence the FFT) is periodic in the frequency domain, with period equal to 2π.
The fft() function returns the DFT with omega (radians/sample) running from 0 to 2π (i.e. 0 to fs, where fs is the sampling frequency). All fftshift() does is swap the two halves of the fft() output vector. So the output of fftshift(fft()) runs from -π to π (i.e. -fs/2 to fs/2).
Usually, people like to plot a good approximation of the DTFT (or maybe even the CTFT) using the FFT, so they zero-pad the input with a large number of zeros (np.fft.fft does this for you if you pass it a length n larger than the signal), and then use the fftshift() function to plot between -π and π.
In other words, use fftshift(fft()) for plotting, and fft() for the math!
fft(fftshift(x)) rotates the input vector so that the phase of the complex FFT result is relative to the center of the original data window. If the input waveform is not exactly integer-periodic in the FFT width, phase relative to the center of the original window of data may make more sense than phase relative to some averaging between the discontinuous beginning and end. fft(fftshift(x)) also has the property that the imaginary component of the result will always be positive for a positive zero crossing at the center of the window of any antisymmetric waveform component.
fftshift(fft(y)) rotates the FFT results so that the DC bin is in the center of the result, halfway between -Fs/2 and Fs/2, which is a common spectrum display format.
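As a quick illustration of the two conventions (a sketch; the 1 kHz tone and 8 kHz sampling rate are arbitrary):

import numpy as np

fs, n = 8000, 64
t = np.arange(n)/fs
x = np.sin(2*np.pi*1000*t)
freqs = np.fft.fftfreq(n, 1/fs)        # order: 0 up to fs/2, then -fs/2 up to 0
freqs_plot = np.fft.fftshift(freqs)    # order: -fs/2 to fs/2, DC bin in the center
X = np.fft.fft(x)
X_plot = np.fft.fftshift(X)
print(freqs[np.argmax(np.abs(X))])            # 1000.0: positive-frequency peak comes first unshifted
print(freqs_plot[np.argmax(np.abs(X_plot))])  # -1000.0: its mirror comes first in the shifted layout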

Am I using Numpy to calculate the Inverse Filter correctly?

As part of a digital image processing class, we have been assigned the Inverse Filter for image restoration. I'm using numpy. The variable names below try to follow the names in Digital Image Processing Gonzalez+Woods, 3e.
A zoom of the original image.
Gaussian kernel "zz.tif", same size as the original image.
Zoom of the Gaussian-smoothed image with no noise added.
import sys
import numpy as np
import imtools  # my own PIL+numpy wrapper, see below

f = imtools.load_image( sys.argv[1], mode="L", dtype="float" )
zz = imtools.load_image( "zz.tif", mode="L", dtype="float" )
F = np.fft.fft2( f )
F2 = np.fft.fftshift( F )
# normalize to [0,1]
H = zz/255.
# calculate the damaged image
G = H * F2
# Inverse Filter
F_hat = G / H
# cheat? replace division by zero (NaN) with zeroes
a = np.nan_to_num(F_hat)
f_hat = np.fft.ifft2( np.fft.ifftshift(a) )
imtools.save_image( np.abs(f_hat), "out.tif" )
imtools is just my wrapper using PIL+numpy to load/store images. (Can post that src, too.)
Zoom of the inverse filtered image.
Am I calculating the Inverse Filter correctly? Am I using numpy correctly?
Is the ringing in the final image expected or am I doing something wrong?
Generally, yes, you seem to be doing things correctly, as far as I know.
The ringing is due to an overly "sharp" high pass filter, but that's what the method you're using does.
However, you might consider using numpy.fft.rfft2 ("real fft") and numpy.fft.irfft2 instead of numpy.fft.fft2 and numpy.fft.ifft2 because you're dealing purely with real values. It should be slightly faster.
I don't know much about Python, but the 'ringing' is normal for the inverse filter. The Gibbs phenomenon lies at the basis of the ringing. Since the input is not entirely smooth but has some discontinuities, an infinite number of Fourier components is in principle needed to represent it completely. A finite number of components is sufficient here since the display resolution is finite, the image being pixelated. However, some information is lost in the recorded image because of the multiplication by zeros in H; as a consequence, the restored image approximates the input image with components covering a finite bandwidth, lower than that of the display, revealing the Gibbs oscillations.
To mitigate this, use proper regularization, as with a 2D Wiener filter: F_hat = G * H.conjugate() / (abs(H)**2 + NSR**2), where NSR is an estimate of the noise-to-signal ratio, e.g. linearly increasing from 0 to 10 at the highest spatial frequency. This will account for the finite signal-to-noise ratio, and when the NSR estimate is close enough you should see little 'ringing' after restoration.
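A minimal sketch of that regularization, reusing H and G from the code above (the linear NSR ramp is just one possible assumption, not a prescription):

ny, nx = H.shape
fy = np.fft.fftshift(np.fft.fftfreq(ny))[:, None]
fx = np.fft.fftshift(np.fft.fftfreq(nx))[None, :]
r = np.hypot(fy, fx)                                   # radial spatial frequency (H is fftshifted)
NSR = 10.0 * r / r.max()                               # 0 at DC, 10 at the highest frequency
F_hat = G * H.conjugate() / (np.abs(H)**2 + NSR**2)    # Wiener-regularized inverse filter
f_hat = np.fft.ifft2(np.fft.ifftshift(F_hat))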
