I am trying to get the phase distribution of a 2D aperture using FFT.
The input is a circle, where everything inside the circle has value 1, outside it has value 0.
To get a good transform, I use an input array whose side is 200x the radius of the circle, i.e. a 5000x5000 grid. This ensures that the circle is sampled finely enough to actually be circular, and that there is enough empty space around it to avoid aliasing (Nyquist) problems.
I need to 2D Fourier transform the aperture and then calculate the phase of the Fourier transform at every point.
The function I use for creating the input (aperture):
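(The snippet itself isn't reproduced here; the following is a minimal sketch of what such a function might look like, assuming numpy. The name make_aperture and the exact sizes are illustrative.)

import numpy as np

def make_aperture(n=5000, radius=25):
    # n x n grid whose side is 200x the circle radius, as described above
    y, x = np.indices((n, n))
    c = n // 2
    return ((x - c)**2 + (y - c)**2 <= radius**2).astype(float)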
Next I do the numpy fft2 2D Fourier transform:
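(Again the snippet isn't reproduced here; a sketch, reusing the hypothetical make_aperture from above:)

aperture = make_aperture()
F = np.fft.fft2(aperture)
phase = np.angle(F)  # atan2(Im, Re) at every point
# np.fft.fftshift(phase) puts the DC bin in the center for display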
And the result of this is a 2D complex array (as expected!), BUT with the imaginary parts much, much smaller than the real parts (about 17 orders of magnitude difference: imaginary parts ~1e-17).
This is not expected and most probably wrong. What went wrong?
The FFT phase of a perfectly symmetric input is zero, i.e. the result is strictly real, and thus atan2(Im, Re) == 0 (imaginary components all zero, except for rounding noise). Here symmetric means even symmetry with respect to (0,0) circularly, or to (n/2, n/2).
The phase will become non-zero (thus a non-zero imaginary component in the FFT result) when the input is moved off center or otherwise made non-symmetric.
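A quick numpy demonstration of both cases (the sizes are made up):

import numpy as np

n = 512
y, x = np.indices((n, n))
centered = ((x - n//2)**2 + (y - n//2)**2 <= 20**2).astype(float)
shifted = ((x - n//2 - 7)**2 + (y - n//2)**2 <= 20**2).astype(float)

# symmetric about (n/2, n/2): imaginary parts are pure rounding noise
print(np.abs(np.fft.fft2(centered).imag).max())
# off center: imaginary parts are comparable to the real parts
print(np.abs(np.fft.fft2(shifted).imag).max())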
I computed derivatives using different methods, such as:
convolution with an array [[-1, 1]].
Using the Fourier theorem: computing the DFT of the image and of the array mentioned above, multiplying them, and performing an IDFT.
Directly through the derivative formula (computing the Fourier transform, multiplying by the index and a constant, and computing the inverse).
All methods seem to work almost identically, but have slight differences.
An explanation of why they end up with slightly different results would be appreciated.
After computing those I started playing with the result to learn about it, and I found out something that confused me:
The main thing that baffles me is that when I try computing the median of this derivative, it's ALWAYS 0.0.
Why is that?
I added the code I used to compute this (the first method at least) because maybe I'm doing something wrong.
import numpy as np
from scipy.signal import convolve2d

im = sl.read_image(r'C:\Users\ahhal\Desktop\Essentials\Uni\year3\SemesterA\ImageProcessing\Exercises\Ex2\external\monkey.jpg', 1)
b = [[-1, 1]]
print(np.median(convolve2d(im, b)))
output: 0.0
The read_image function is my own and this is the implementation:
from imageio import imread
from skimage.color import rgb2gray
import numpy as np
def read_image(filename, representation):
    """
    Receives an image file and converts it into one of two given representations.
    :param filename: The file name of an image on disk (could be grayscale or RGB).
    :param representation: Representation code, either 1 or 2, defining whether the output
        should be a grayscale image (1) or an RGB image (2). If the input image is grayscale,
        we won't call it with representation = 2.
    :return: An image, represented by a matrix of type np.float64 with intensities
        normalized to the range [0, 1].
    """
    assert representation in [1, 2]
    # read the image from disk
    im = imread(filename)
    if representation == 1:     # if the user asked for a grayscale image,
        if len(im.shape) == 3:  # AND the image is not grayscale yet
            im = rgb2gray(im)   # convert to grayscale (assuming it's RGB and not a different format)
    im_float = im.astype(np.float64)  # convert the image to a type we can work with
    if im_float.max() > 1:  # if image values are out of range, normalize them
        im_float = im_float / 255
    return im_float
Edit 2:
I tried it on several different images, and got 0.0 at all of them.
The image I'm using in the example is: [image omitted]
I computed derivatives using different methods, such as:
convolution with an array [[-1, 1]].
Using the Fourier theorem: computing the DFT of the image and of the array mentioned above, multiplying them, and performing an IDFT.
Directly through the derivative formula (computing the Fourier transform, multiplying by the index and a constant, and computing the inverse).
These derivative methods are all approximate and make different assumptions:
1. Convolution by [[-1, 1]] computes differences between adjacent elements,
derivative ~= data[n+1] - data[n]
You can interpret this like interpolating the data with a line segment, then taking the derivative of that interpolant:
I(x) = data[n] + (data[n+1] - data[n]) * (x - n)
So the approximation assumes the underlying function is locally linear. You can analyze the error by Taylor expansion to find that the error comes from the ignored higher-order terms. In other words, the approximation is accurate provided the function doesn't have strong nonlinear terms. This is a simple case of finite differences.
2. This is the same as 1, except with different boundary handling for samples near the edges of the image. By default, scipy.signal.convolve2d does zero padding (though you can use the boundary option to choose other methods). However, when computing the convolution through the DFT, the boundary handling is implicitly periodic, wrapping around at the image edges. So the results of 1 and 2 differ in a margin of pixels near the edges because of the different boundary handling.
3. Computing the derivative through multiplying by iω in the DFT representation can be interpreted as evaluating the derivative of the sinc interpolation of the data. Sinc interpolation assumes the data is band-limited. The error comes from spectra beyond the Nyquist frequency. In particular, if there is a hard jump discontinuity from an object boundary, then the image is not band-limited, and the DFT-based derivative will have substantial error in the vicinity of the jump, appearing as ringing artifacts.
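To make the comparison concrete, here is a small sketch of all three methods on a random image, assuming numpy and scipy (the derivative is taken along x):

import numpy as np
from scipy.signal import convolve2d

im = np.random.rand(64, 64)

# 1. Linear convolution with zero padding at the boundary (note that
#    convolution flips the kernel, so this computes im[n-1] - im[n])
d1 = convolve2d(im, [[-1, 1]], mode='same', boundary='fill')

# 2. The same kernel through the DFT: multiplication in frequency is
#    circular convolution, so the boundary wraps around instead
k = np.zeros_like(im)
k[0, 0], k[0, 1] = -1, 1
d2 = np.real(np.fft.ifft2(np.fft.fft2(im) * np.fft.fft2(k)))

# 3. Spectral derivative: multiply by i*omega = 2*pi*i*f along x
w = 2j * np.pi * np.fft.fftfreq(im.shape[1])
d3 = np.real(np.fft.ifft2(np.fft.fft2(im) * w[np.newaxis, :]))

print(np.abs(d1 - d2).max())  # nonzero only in the first column (boundary)
# d3 differs from both wherever the sinc and linear interpolants disagree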
The main thing that baffles me is that when I try computing the median of this derivative, it's ALWAYS 0.0.
I don't know why this happened here, but it shouldn't always be the case. For instance, if each image row is the unit ramp data[n] = n, then the convolution by [[-1, 1]] equals 1 everywhere under the data[n+1] - data[n] convention above (except possibly at the edges, depending on boundary handling), so the median is 1.
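For instance (a sketch; note that convolve2d flips the kernel, so the interior values come out as -1 rather than +1, but the point stands):

import numpy as np
from scipy.signal import convolve2d

im = np.tile(np.arange(10.0), (5, 1))        # every row is the ramp 0, 1, ..., 9
print(np.median(convolve2d(im, [[-1, 1]])))  # -1.0, not 0.0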
Pascal already gave a wonderful explanation of the differences between the various approximations to the derivative. So I'll focus here on the "why always 0.0?" question.
The median of the derivative is 0.0 only by approximation. When I compute it, based on the finite difference approximation (method #1), I get -5.15e-5 as the median. Close to zero, but not exactly zero.
The derivative is 0 in uniform (flat) regions of the image, such as the out-of-focus background. Other features in the image tend to have both a positive and a negative edge, making the histogram of the derivative image very symmetric: [histogram omitted]
This symmetry causes the median (as well as the mean) to be close to zero for such an image. However, this is not always the case. For example, if the image is brighter on the left edge than the right edge (or the other way around), then there must be a net gradient across the image, causing the mean or median to be different from zero.
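For instance, an image that ramps from dark on the left to bright on the right has same-sign differences almost everywhere, so the median moves away from zero (a sketch):

import numpy as np
from scipy.signal import convolve2d

im = np.tile(np.linspace(0, 1, 100), (100, 1))  # left-to-right brightness ramp
print(np.median(convolve2d(im, [[-1, 1]])))     # about -0.01, not 0.0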
Is it somehow possible to determine the array length of the arrays in the tck tuple returned by scipy.interpolate.splprep before computing the values?
I have to fit a spline interpolation to noisy data with 5 million data points (or less, can be varying).
My observation is that the interpolation at an array length of ~90 is pretty good, while it takes a long time to compute the interpolation for higher array lengths (the length also sometimes jumps directly from ~90 to ~1000 when making s one step smaller, and the interpolation becomes noisy), and it is not accurate enough if the array length is far less (<50)...
Actually, this array length depends on the smoothing factor s provided to the splprep function, but for different measurement data, s varies a lot for a consistent array length of around 90. E.g. for data1, s has to be around 1000 to get len(cfk[0]) equal to 90; for data2, s has to be around 100 to get len(cfk[0]) equal to 90, with data1 and data2 of the same length. It might depend on the noise of the data...
I have thought about a loop where s starts at some value and decreases through the loop while len(cfk[0]) is constantly checked, but this takes ages, especially as len(cfk[0]) gets closer to 90.
Therefore, it would be useful to somehow know the smoothing factor to get the desired array length before computing the cfk tuple.
Short answer: no, not easily. The Dierckx Fortran library, which splrep wraps, uses some fairly non-trivial logic for determining the knot vector, and it's all baked into the Fortran code. So the only way is to carefully trace that code. It's available from netlib, and also in scipy/interpolate/fitpack.
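What you can do is make the search loop from the question cheaper by bisecting on s instead of decreasing it step by step. A heuristic sketch (find_s is hypothetical, and it assumes the knot count shrinks roughly monotonically as s grows, which FITPACK does not guarantee):

import numpy as np
from scipy.interpolate import splprep

def find_s(x, y, target=90, s_lo=1e-2, s_hi=1e6, iters=30):
    """Bisect (in log space) on the smoothing factor s until
    the knot vector length len(tck[0]) is close to target."""
    for _ in range(iters):
        s_mid = np.sqrt(s_lo * s_hi)
        tck, _ = splprep([x, y], s=s_mid)
        if len(tck[0]) > target:
            s_lo = s_mid   # too many knots: smooth harder
        elif len(tck[0]) < target:
            s_hi = s_mid   # too few knots: smooth less
        else:
            break
    return s_mid, tck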
I'm currently computing the spectrogram with matplotlib. I specify NFFT=512, but the resulting image has a height of 257. I then tried to just do an STFT (short-time Fourier transform), which gives me 512-dimensional vectors (as expected). If I plot the result of the STFT, I can see that half of the 512 values are just mirrored, so really I only get 257 unique values (like matplotlib). Can somebody explain to me why that is the case? I always thought of the FT as a basis transform; why would it introduce this redundancy?
Thank you.
The redundancy is because you input a strictly real signal to your FFT, so the DFT result is complex-conjugate (Hermitian) symmetric. This redundancy arises because all the imaginary components of a strictly real input are zero, but the output of a DFT can include non-zero imaginary components to indicate phase. Thus, this DFT result has to be conjugate symmetric so that all the imaginary components in the result cancel out between the two halves (same magnitudes, but opposite phases), indicating strictly real input. Also, the lower 257 bins of the basis transform still carry 512 degrees of (scalar) freedom, just like the input. However, a spectrogram throws away all phase information, so it can only display 257 unique values (magnitude-only).
If you input a complex (quadrature, for instance) signal to a DFT, then there would likely not be Hermitian redundancy, and you would have 1024 degrees of freedom from a 512 length DFT.
If you want an image height of 512 (given real input), try an FFT size of 1024.
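Both points are easy to check with numpy (a quick sketch):

import numpy as np

x = np.random.rand(512)  # strictly real input
X = np.fft.fft(x)

# Hermitian symmetry: bin k is the complex conjugate of bin N-k
print(np.allclose(X[1:256], np.conj(X[-1:-256:-1])))  # True

# rfft keeps only the non-redundant half: 512/2 + 1 = 257 bins
print(np.fft.rfft(x).shape)  # (257,)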
I am trying to implement an algorithm in python, but I am not sure when I should use fftshift(fft(fftshift(x))) and when only fft(x) (from numpy). Is there a rule of thumb based on the shape of input data?
I am using fftshift instead of ifftshift due to the even number of values in the vector x.
It really just depends on what you want. The DFT (and hence the FFT) is periodic in the frequency domain with period equal to 2pi.
The fft() function will return the approximation of the DFT with omega (radians/sample) from 0 to 2pi (i.e. 0 to fs, where fs is the sampling frequency). All fftshift() does is swap the two halves of the output vector of fft() right down the middle. So the output of fftshift(fft()) is from -pi to pi (i.e. -fs/2 to fs/2).
Usually, people like to plot a good approximation of the DTFT (or maybe even the CTFT) using the FFT, so they zero-pad the input with a huge number of zeros (fft() can do this for you via its length argument n), and then they use fftshift() to plot between -pi and pi.
In other words, use fftshift(fft()) for plotting, and fft() for the math!
fft(fftshift(x)) rotates the input vector so that the phase of the complex FFT result is relative to the center of the original data window. If the input waveform is not exactly integer-periodic in the FFT width, phase relative to the center of the original window of data may make more sense than phase relative to some averaging between the discontinuous beginning and end. fft(fftshift(x)) also has the property that the imaginary component of a result will always be positive for a positive zero crossing at the center of the window of any antisymmetric waveform component.
fftshift(fft(y)) rotates the FFT results so that the DC bin is in the center of the result, halfway between -Fs/2 and Fs/2, which is a common spectrum display format.
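A small plotting sketch of that second form, assuming numpy and matplotlib (the signal and sample rate are made up):

import numpy as np
import matplotlib.pyplot as plt

fs = 1000.0                     # sample rate in Hz
t = np.arange(512) / fs
x = np.sin(2 * np.pi * 50 * t)  # a 50 Hz tone

X = np.fft.fftshift(np.fft.fft(x))                   # DC bin moved to the center
f = np.fft.fftshift(np.fft.fftfreq(x.size, d=1/fs))  # axis from -fs/2 to fs/2

plt.plot(f, np.abs(X))
plt.xlabel('frequency [Hz]')
plt.show()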
I want a dimensionality reduction such that the dimensions it returns are circular.
E.g., if I reduce 12-d data to 2-d, normalized between 0 and 1, then I want (0,0) to be equally close to (.1,.1) and (.9,.9).
What is my algorithm? (bonus points for python implementation)
PCA gives me 2d plane of data, whereas I want spherical surface of data.
Make sense? Simple? Inherent problems? Thanks.
I think what you're asking for comes down to a transformation.
Circular
I want (0,0) to be equally close to (.1,.1) and (.9,.9).
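One way to state that requirement precisely is a wrap-around (circular) distance on [0, 1); a numpy sketch:

import numpy as np

def circular_dist(a, b):
    # per-dimension wrap-around distance on [0, 1)
    d = np.abs(np.asarray(a) - np.asarray(b))
    return np.linalg.norm(np.minimum(d, 1 - d))

print(circular_dist((0, 0), (.1, .1)))  # ~0.141
print(circular_dist((0, 0), (.9, .9)))  # ~0.141 as well: equally close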
PCA
Taking your approach of normalization, what you could do is map the values in the interval [0.5, 1] to [0.5, 0].
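In numpy terms that folding is just (a sketch):

import numpy as np

v = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
folded = np.minimum(v, 1 - v)  # maps [0.5, 1] onto [0.5, 0]
print(folded)                  # [0.  0.1 0.5 0.1 0. ]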
MDS
If you want to use a distance metric, you could first compute the distances and then do the same. For instance, taking the correlation, you could do 1 - abs(corr). Since the correlation is in [-1, 1], strong positive and negative correlations will give values close to zero, while uncorrelated data will give values close to one. Then, having computed the distances, you use MDS to get your projection.
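A sketch of that pipeline, assuming scikit-learn is available (the data array is made up):

import numpy as np
from sklearn.manifold import MDS

data = np.random.rand(50, 12)  # 50 samples of 12-d data

corr = np.corrcoef(data)       # sample-to-sample correlation
dist = 1 - np.abs(corr)        # strong (+/-) correlation -> small distance

emb = MDS(n_components=2, dissimilarity='precomputed').fit_transform(dist)
print(emb.shape)               # (50, 2)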
Space
PCA gives me 2d plane of data, whereas I want spherical surface of data.
Since you want a spherical surface, I think you can directly transform the 2-d plane to a sphere. A spherical coordinate system with a constant radius would do that, wouldn't it?
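For example, one literal reading of that, as a sketch assuming numpy (note that only the longitude wraps around this way):

import numpy as np

def to_sphere(u, v):
    """Map normalized 2-d coordinates (u, v) in [0, 1] onto the unit sphere."""
    theta = 2 * np.pi * u  # longitude: u = 0 and u = 1 coincide
    phi = np.pi * v        # latitude (polar angle)
    return np.array([np.sin(phi) * np.cos(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(phi)])

print(np.allclose(to_sphere(0.0, 0.3), to_sphere(1.0, 0.3)))  # True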
Another question is then: Is all this a reasonable thing to do?