FFT normalization with numpy - python

I've just started working with the numpy package and began with a simple task: computing the FFT of an input signal. Here's the code:
import numpy as np
import matplotlib.pyplot as plt
#Some constants
L = 128
p = 2
X = 20
x = np.arange(-X/2,X/2,X/L)
fft_x = np.linspace(0,128,128, True)
fwhl = 1
fwhl_y = (2/fwhl) \
*(np.log([2])/np.pi)**0.5*np.e**(-(4*np.log([2]) \
*x**2)/fwhl**2)
fft_fwhl = np.fft.fft(fwhl_y, norm='ortho')
ampl_fft_fwhl = np.abs(fft_fwhl)
plt.bar(fft_x, ampl_fft_fwhl, width=.7, color='b')
plt.show()
Since the signal is a Gaussian scaled by a constant involving pi, I expect to get a Gaussian in Fourier space as well, with the constant (zero-frequency) component of the FFT equal to 1.
But the value of that component I get from numpy is larger (about 1.13). This is an amplitude spectrum normalized by 1/(number_of_counts)**0.5 (that's what I read in the numpy documentation). I can't understand what's wrong. Can anybody help me?
Thanks!
[EDITED] It seems like the problem is solved, all you need to get the same result of Fourier integral and of FFT is to multiply FFT by the step (in my case it's X/L). And as for normalization as option of numpy.fft.fft(..., norm='ortho'), it's used only to save the scale of the transform, otherwise you'll need to divide the result of the inverse FFT by the number of samples. Thanks everyone for their help!

I've finally solved my problem. To relate the FFT to the Fourier integral, multiply the result of the transform (FFT) by the sampling step (X/L in my case), i.e. FFT*X/L. This works in general. My case is a bit more complex because the function to be transformed has an extra constraint: the area under the curve must equal 1, since it is a model of the δ function. Because the step is fixed, I have to satisfy the condition step*sum(fwhl_y)=1, that is X/L=1/sum(fwhl_y). So to get the correct result I do the following:
1) Calculate the FFT: fft_fwhl = np.fft.fft(fwhl_y)
2) Get rid of the phase component. It appears because fwhl_y is symmetric and defined on the interval [-T/2,T/2], where T is the period, while np.fft.fft treats the function as defined on [0,T]. Since I only need the amplitude spectrum, I simply use np.abs(FFT)
3) Multiply the result of the previous step by X/L to get the values I expect, that is np.abs(FFT)*X/L
4) Apply the extra condition on the area under the curve, X/L*sum(fwhl_y)=1, which finally gives np.abs(FFT)*X/L = np.abs(FFT)/sum(fwhl_y)
Hope it'll help anyone at least.
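The scaling described above can be checked numerically. The following is a minimal sketch, assuming the same grid (X = 20, L = 128) and the same unit-area Gaussian as in the question; the names dx, amplitude, area and alt_amplitude are introduced here only for illustration.

```python
import numpy as np

# Same grid as in the question: L samples spanning [-X/2, X/2)
X, L = 20, 128
dx = X / L                      # sampling step
x = np.arange(-X / 2, X / 2, dx)

# Unit-area Gaussian with full width at half maximum fwhl
fwhl = 1.0
y = (2 / fwhl) * np.sqrt(np.log(2) / np.pi) * np.exp(-4 * np.log(2) * x**2 / fwhl**2)

# Unnormalized FFT multiplied by the step approximates the Fourier integral
amplitude = np.abs(np.fft.fft(y)) * dx

# Riemann-sum estimate of the area under the curve
area = y.sum() * dx

# Equivalent form when the area is 1: divide by the sum of the samples
alt_amplitude = np.abs(np.fft.fft(y)) / y.sum()

print(amplitude[0], area)
```

With these assumptions the zero-frequency component amplitude[0] comes out very close to 1, and both scalings agree because dx is approximately 1/sum(y).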

Here's a possible solution to your problem:
import numpy as np
import matplotlib.pyplot as plt
from scipy import fft
from numpy import log, pi, e
# Signal setup
Fs = 150
Ts = 1.0 / Fs
t = np.arange(0, 1, Ts)
ff = 50
fwhl = 1
y = (2 / fwhl) * (log([2]) / pi)**0.5 * e**(-(4 * log([2]) * t**2) / fwhl**2)
# Plot original signal
plt.subplot(2, 1, 1)
plt.plot(t, y, 'k-')
plt.xlabel('time')
plt.ylabel('amplitude')
# Normalized FFT
plt.subplot(2, 1, 2)
n = len(y)
k = np.arange(n)
T = n / Fs
frq = k / T
freq = frq[:n // 2]  # integer division: n / 2 is a float in Python 3
Y = np.fft.fft(y) / n
Y = Y[:n // 2]
plt.plot(freq, abs(Y), 'r-')
plt.xlabel('freq (Hz)')
plt.ylabel('|Y(freq)|')
plt.show()
[Figure: signal and FFT with fwhl=1]
[Figure: signal and FFT with fwhl=0.1]
The graphs above show how the Gaussian and its FFT change as fwhl approaches 0.

Related

How to correctly scale the FFT python function in order to validate it with the rect() and sinc() function pair?

I am currently working on a Python implementation that uses the FFT to convert signals from the time domain to the frequency domain and back. To validate my FFT function, I've tried to use the rectangular and sinc function pair given by
def rect(t, T):  # rectangular function
    ans = np.zeros(len(t))
    for i in range(len(t)):
        if abs(t[i]) < T/2:
            ans[i] = 1
    return ans
and
def Tsinc_hz(f, T):  # sinc function
    return np.sin(np.pi * f * T) / (np.pi * f)
Unfortunately, the results are unsatisfying and I don't quite get why. I hope someone is able to help me.
Here is my FFT code:
def to_frequency_domain(t, ys):
    last = t[-1]  # length of time sample, so is equal to limit
    N = len(t)  # number of samples in the time domain
    T = last/N  # 1/sample rate
    xf = fftfreq(N, T)
    xf = fftshift(xf)
    yf = fft(ys, N)
    yf = fftshift(yf)
    return xf, yf

def to_time_domain(x, y):
    N = len(x)  # number of samples in the time domain
    last = (len(x)-1) / (2*x[-1])  # length in the time domain
    T = last/(N-1)  # 1/sample rate
    ys = ifft(ifftshift(y), N)
    return ys
The following code is supposed to validate the FFT implementation by comparing the analytical functions given above with the FFT output in time and frequency domain.
t = np.linspace(0, 10, 250*10)
ans = rect(t, 1)
x_val,y_val = to_frequency_domain(t, ans)
f = np.linspace(-125, 125, 2500)
proof = Tsinc_hz(f, 1)
# sinc function comparison plot
plt.plot(x_val, y_val.real)
plt.plot(f,proof.real, c='r')
plt.show()
proof_time = to_time_domain(f, proof)
# rectangular function comparison plot
plt.plot(t, proof_time, linewidth = 3, c = 'r')
plt.plot(t, ans)
plt.show()
Running the code gives the following plot:
[Figures: sinc function comparison plot and rectangular function comparison plot]
It is obvious that there is a scaling problem. I know that the frequency axis depends on the sample rate of my time domain. In this case I used a sample rate of 250 Hz and 2500 data points, which gives 0.1 Hz per frequency bin, so my x-axis in the frequency domain reaches from -125 to 125. I was wondering whether I can also formulate a relation between the magnitude in the frequency domain and the sample rate. I was thinking that if I keep the number of data points constant and reduce the sampling rate, the function is obviously compressed along the x-axis. Can this somehow result in stretching along the y-axis?
Furthermore, the transform of the sinc function in the second plot (proof_time) is mirrored.
Next, I tried to fix the first problem by multiplying the FFT output by the sampling interval T (i.e. dividing by the sampling rate):
def to_frequency_domain(t, ys):
    last = t[-1]  # length of time sample, so is equal to limit
    N = len(t)  # number of samples in the time domain
    T = last/N  # 1/sample rate
    xf = fftfreq(N, T)
    xf = fftshift(xf)
    yf = fft(ys, N)*T
    yf = fftshift(yf)
    return xf, yf

def to_time_domain(x, y):
    N = len(x)  # number of samples in the time domain
    last = (len(x)-1) / (2*x[-1])  # length in the time domain
    T = last/(N-1)  # 1/sample rate
    ys = ifft(ifftshift(y)/T, N)
    return ys
which gives:
[Figures: scaled sinc function comparison plot and scaled rectangular function comparison plot]
This time the two functions are obviously closer together, but the result is still not perfect.
Additionally, I am a little confused by the output of the FFT when I define my time domain to be symmetric about zero (t = np.linspace(-10, 10, 250*20)):
[Figure: sinc function comparison plot]
Why is the blue curve doubled like that? Supposedly because we have a negative and positive frequency component for every value in time, right? But how do I fix that?
I have tried figuring it out for a while now but just can’t seem to solve the problem, so I am very grateful for every tip!
Thanks in advance!
My guess is that you have to scale by d_omega = d_f/(2*pi), since the FFT assumes a time step of 1 by default while you are using a different time step. If I add
df = xf[1] - xf[0]
domega = df / 2 / np.pi
yf = fft(ys, N) * domega
it scales.
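For comparison, here is a minimal sketch of the rect/sinc check that uses the time step dt as the scale factor (the factor the question itself tried with *T), assuming the Riemann-sum approximation X(f) ≈ dt * FFT. It also assumes a time axis symmetric about zero and applies ifftshift so that the sample at t = 0 sits at index 0, which avoids a linear phase factor. The variable names and the use of np.sinc are choices made for this illustration, not taken from the thread.

```python
import numpy as np

dt = 0.004                              # time step (sample rate 250 Hz)
N = 5000                                # number of samples, 20 s in total
t = (np.arange(N) - N // 2) * dt        # symmetric time axis, t[N // 2] == 0
T = 1.0                                 # width of the rectangle

rect = (np.abs(t) < T / 2).astype(float)

# Frequency axis in Hz, reordered so that it runs from negative to positive
f = np.fft.fftshift(np.fft.fftfreq(N, d=dt))

# ifftshift puts t = 0 at index 0; multiplying by dt turns the DFT sum
# into a Riemann-sum approximation of the Fourier integral
spectrum = dt * np.fft.fftshift(np.fft.fft(np.fft.ifftshift(rect)))

# Analytical transform T*sinc(f*T); np.sinc(u) is sin(pi*u)/(pi*u)
expected = T * np.sinc(f * T)

low = np.abs(f) < 5                     # compare at low frequencies only
max_error = np.max(np.abs(spectrum[low] - expected[low]))
print(max_error)
```

Under these assumptions the remaining error is on the order of dt, because the sampled rectangle can only represent its width to within one sample.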

Applying a half-gaussian filter to binned time series data in python

I am binning some time series data, and I need to apply a half-normal filter to the binned data. How can I do this in Python? I've provided a toy example below. I need Xbinned to be smoothed with a half-Gaussian filter with a std of 0.25 (or whatever). I'm pretty sure the half-Gaussian should be facing the forward time direction.
import numpy as np
X = np.random.randint(2, size=100) #example random process
bin_size = 5
Xbinned = []
for i in range(0, len(X)+1, bin_size):
    Xbinned.append(sum(X[i:i+(bin_size-1)])/bin_size)
How to implement half-gaussian filtering
Scipy has a function called scipy.ndimage.gaussian_filter(). It nearly implements what we want here. Unfortunately, there's no option to use a half-gaussian instead of a gaussian. However, scipy is open-source, so we can just take the source code and modify it to be a half-gaussian.
I used this source code, and removed all of the parts that are not needed for this particular case. At the end, I had this:
import numpy as np
import scipy.ndimage

def halfgaussian_kernel1d(sigma, radius):
    """
    Computes a 1-D Half-Gaussian convolution kernel.
    """
    sigma2 = sigma * sigma
    x = np.arange(0, radius+1)
    phi_x = np.exp(-0.5 / sigma2 * x ** 2)
    phi_x = phi_x / phi_x.sum()
    return phi_x

def halfgaussian_filter1d(input, sigma, axis=-1, output=None,
                          mode="constant", cval=0.0, truncate=4.0):
    """
    Convolves a 1-D Half-Gaussian convolution kernel.
    """
    sd = float(sigma)
    # make the radius of the filter equal to truncate standard deviations
    lw = int(truncate * sd + 0.5)
    weights = halfgaussian_kernel1d(sigma, lw)
    origin = -lw // 2
    return scipy.ndimage.convolve1d(input, weights, axis, output, mode, cval, origin)
A short summary of how this works:
First, it generates a convolution kernel. It uses the formula e^(-1/2 * (x/sigma)^2) to generate the gaussian distribution. It keeps going until you're 4 standard deviations away from the center.
Next, it convolves that kernel against your signal. It adjusts the kernel to start at the current timestep instead of being centered on the current timestep.
Trying this on your signal, I get a result like this:
array([0.59979879, 0.6 , 0.40006707, 0.59993293, 0.79993293,
0.40013414, 0.20006707, 0.59986586, 0.40006707, 0.4 ,
0.99979879, 0.00033535, 0.59979879, 0.40006707, 0.00013414,
0.59979879, 0.20013414, 0.00006707, 0.19993293, 0.59986586])
Choice of standard deviation
If you pick a standard deviation of 0.25, that is going to have almost no effect on your signal. Here are the convolution weights it uses: [0.99966465 0.00033535]. In other words, this has less than a 0.1% effect on the signal.
I'd recommend using a larger sigma value.
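To see how the choice of sigma changes the kernel, the weights can be computed on their own. This is a small sketch that repeats the kernel formula from the answer in a standalone helper; the name halfgaussian_weights and the comparison value sigma = 2.0 are assumptions made for this illustration.

```python
import numpy as np

def halfgaussian_weights(sigma, truncate=4.0):
    """Normalized one-sided Gaussian weights, truncated at `truncate` sigmas."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(0, radius + 1)
    w = np.exp(-0.5 * (x / sigma) ** 2)
    return w / w.sum()

w_small = halfgaussian_weights(0.25)   # two weights, almost all mass on the first
w_large = halfgaussian_weights(2.0)    # nine weights, mass spread over several bins

print(w_small)
print(w_large)
```

With sigma = 0.25 the kernel has only two taps and the first one carries nearly all of the weight, which is why the filter barely changes the signal.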
Off by one error
Also, I want to point out the off-by-one error here:
for i in range(0, len(X)+1, bin_size):
    Xbinned.append(sum(X[i:i+(bin_size-1)])/bin_size)
Python slices exclude the end index, so a slice from i to i+(bin_size-1) actually captures 4 elements, not 5.
To fix this, you can change it to this:
for i in range(0, len(X), bin_size):
    Xbinned.append(X[i:i+bin_size].mean())
(Also, I fixed an off-by-one error in the loop specification and used a numpy shortcut for finding the mean.)
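As an alternative to the loop, the binning can be written without explicit iteration. This is a sketch, assuming the array length is trimmed to a multiple of bin_size; the names n_bins and loop_result are introduced here only for the comparison.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
X = rng.integers(0, 2, size=100)        # example random 0/1 process
bin_size = 5

# Drop any trailing samples that do not fill a whole bin
n_bins = len(X) // bin_size
Xbinned = X[:n_bins * bin_size].reshape(n_bins, bin_size).mean(axis=1)

# Same result with the corrected loop from the answer
loop_result = np.array([X[i:i + bin_size].mean()
                        for i in range(0, n_bins * bin_size, bin_size)])

print(Xbinned)
```

The reshape turns each bin into one row, so the mean along axis 1 is the per-bin average.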

Fourier series data fit with numpy: fft vs coding

Suppose I have some data, y, to which I would like to fit a Fourier series. On this post, a solution was posted by Mermoz using the complex format of the series and "calculating the coefficient with a riemann sum". On this other post, the series is obtained through the FFT and an example is written down.
I tried implementing both approaches (image and code below; note that every time the code is run, different data will be generated due to the use of numpy.random.normal), but I wonder why I am getting different results: the Riemann approach seems "wrongly shifted" while the FFT approach seems "squeezed". I am also not sure about my definition of the period "tau" for the series. I appreciate the attention.
I am using Spyder with Python 3.7.1 on Windows 7
Example
import matplotlib.pyplot as plt
import numpy as np
# Assume x (independent variable) and y are the data.
# Arbitrary numerical values for question purposes:
start = 0
stop = 4
mean = 1
sigma = 2
N = 200
terms = 30 # number of terms for the Fourier series
x = np.linspace(start,stop,N,endpoint=True)
y = np.random.normal(mean, sigma, len(x))
# Fourier series
tau = (max(x)-min(x)) # assume that signal length = 1 period (tau)
# From ref 1
def cn(n):
    c = y*np.exp(-1j*2*n*np.pi*x/tau)
    return c.sum()/c.size

def f(x, Nh):
    f = np.array([2*cn(i)*np.exp(1j*2*i*np.pi*x/tau) for i in range(1,Nh+1)])
    return f.sum()
y_Fourier_1 = np.array([f(t,terms).real for t in x])
# From ref 2
Y = np.fft.fft(y)
np.put(Y, range(terms+1, len(y)), 0.0) # zero-ing coefficients above "terms"
y_Fourier_2 = np.fft.ifft(Y)
# Visualization
f, ax = plt.subplots()
ax.plot(x,y, color='lightblue', label = 'artificial data')
ax.plot(x, y_Fourier_1, label = ("'Riemann' series fit (%d terms)" % terms))
ax.plot(x,y_Fourier_2, label = ("'FFT' series fit (%d terms)" % terms))
ax.grid(True, color='dimgray', linestyle='--', linewidth=0.5)
ax.set_axisbelow(True)
ax.set_ylabel('y')
ax.set_xlabel('x')
ax.legend()
Performing two small modifications is sufficient to make the sums nearly similar to the output of np.fft. The FFTW library indeed computes these sums.
1) The average of the signal, c[0] is to be accounted for:
f = np.array([2*cn(i)*np.exp(1j*2*i*np.pi*x/tau) for i in range(0,Nh+1)]) # here : 0, not 1
2) The output must be scaled.
y_Fourier_1=y_Fourier_1*0.5
The output seems "squeezed" because the high frequency components have been filtered. Indeed, the high frequency oscillations of the input have been cleared and the output looks like a moving average.
Here, tau is actually defined as stop-start: it corresponds to the length of the frame. It is the expected period of the signal.
If the frame does not correspond to a period of the signal, you can guess its period by convolving the signal with itself and finding the first maximum. See "Find period of a signal out of the FFT". Nevertheless, it is unlikely to work properly with a dataset generated by numpy.random.normal: this is additive white Gaussian noise. As it features a constant power spectral density, it can hardly be described as periodic!
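The relation between the two approaches can be checked directly. The sketch below is an illustration under stated assumptions, not the code from either referenced post: it samples with endpoint=False so that the Riemann-sum grid matches the grid the DFT assumes (the question used endpoint=True, which makes the two grids differ slightly), and it keeps the matching negative-frequency bins when truncating the FFT (the question's np.put call zeroes them as well). All variable names are chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
N = 200
tau = 4.0                               # assumed period = frame length
terms = 30

# endpoint=False: for a periodic signal the sample at x = tau repeats x = 0
x = np.linspace(0.0, tau, N, endpoint=False)
y = rng.normal(1.0, 2.0, N)

def cn(n):
    """Riemann-sum estimate of the n-th complex Fourier coefficient."""
    return np.sum(y * np.exp(-2j * np.pi * n * x / tau)) / N

riemann_coeffs = np.array([cn(n) for n in range(terms + 1)])
fft_coeffs = np.fft.fft(y)[:terms + 1] / N

# Real series: c_0 plus twice the real part of the positive-frequency terms
n = np.arange(1, terms + 1)
basis = np.exp(2j * np.pi * np.outer(x, n) / tau)
series_riemann = riemann_coeffs[0].real + 2 * np.real(basis @ riemann_coeffs[1:])

# Same truncation via the FFT: keep bins 0..terms and the mirrored negative bins
Y = np.fft.fft(y)
Y[terms + 1:N - terms] = 0.0
series_fft = np.fft.ifft(Y).real

print(np.allclose(riemann_coeffs, fft_coeffs), np.allclose(series_riemann, series_fft))
```

On this grid the Riemann-sum coefficients equal the FFT bins divided by N, and the two truncated reconstructions coincide.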

Using scipy.fftpack.fft how to interprete numerical result of Fourier Transform

The analytical Fourier transform of a sinusoidal signal is purely imaginary. However, when numerically computing the discrete Fourier transform, the result is not.
Tldr: Find all answers to this question here.
Consider therefore the following code
import matplotlib.pyplot as plt
import numpy as np
from scipy.fftpack import fft, fftfreq
f_s = 200 # Sampling rate = number of measurements per second in [Hz]
t = np.arange(0,10000, 1 / f_s)
N = len(t)
A = 4 # Amplitude of sinus signal
x = A * np.sin(t)
X = fft(x)[1:N//2]
freqs = (fftfreq(len(x)) * f_s)[1:N//2]
fig, (ax1,ax2) = plt.subplots(2,1, sharex = True)
# Raw strings keep the LaTeX backslashes from being read as escape sequences
ax1.plot(freqs, X.real, label = r"$\Re[X(\omega)]$")
ax1.plot(freqs, X.imag, label = r"$\Im[X(\omega)]$")
ax1.set_title(r"Discrete Fourier Transform of $x(t) = A \cdot \sin(t)$")
ax1.legend()
ax1.grid(True)
ax2.plot(freqs, np.abs(X), label = r"$|X(\omega)|$")
ax2.legend()
ax2.set_xlabel(r"Frequency $\omega$")
ax2.set_yscale("log")
ax2.grid(True, which = "both")
ax2.set_xlim(0.15,0.175)
plt.show()
Clearly, the absolute value |X(w)| can be used as good approximation to the analytical result. However, the imaginary and real value of the function X(w) are different. Already another question on SO mentioned this fact, but did not explain why. So I can only use the absolute value and the phase?
Another question would be how the Amplitude is related to the numerical result. Mathematically speaking it should be the integral under the curve of |X(w)| divided by normalization (which, as far as I understood, should be given by N), i.e. approximately by
A_approx = np.sum(np.abs(X)) / N
print(f"Numerical value: {A_approx:.1f}, Correct value: {A:.1f}")
Numerical value: 13.5, Correct value: 4.0
This does not seem to be the case. Any insights? Ideas?
Related questions which did not help are here and here.
An FFT does not produce the result you expect because it is finite in length, and thus more similar to the Fourier Transform of a rectangular window on your sinusoid. The length and placement of this rectangular window will affect the phase and amplitude of the FFT result.
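The effect of the window placement can be illustrated with a short experiment. This is a sketch under assumptions chosen for the illustration (N = 1000 samples, a sine with exactly 10 periods in the window versus one with 10.5 periods); it is not the setup from the question.

```python
import numpy as np

N = 1000
n = np.arange(N)
A = 4.0

# Case 1: a whole number of periods fits in the window
k0 = 10
x_whole = A * np.sin(2 * np.pi * k0 * n / N)
X_whole = np.fft.fft(x_whole)
amp_whole = 2 * np.abs(X_whole[k0]) / N     # single-bin amplitude estimate

# Case 2: 10.5 periods in the window, so the energy leaks across bins
x_leak = A * np.sin(2 * np.pi * 10.5 * n / N)
X_leak = np.fft.fft(x_leak)

print(amp_whole)
print(X_whole[k0].real, X_whole[k0].imag)   # real part is negligible here
print(X_leak[k0].real, X_leak[k0].imag)     # real part is no longer negligible
```

When the window holds a whole number of periods, the bin at k0 is essentially purely imaginary and 2*|X[k0]|/N recovers the amplitude; with a fractional number of periods, the real part of the nearby bins becomes large.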

Fourier transform of a Gaussian is not a Gaussian, but thats wrong! - Python

I am trying to use NumPy's fft function. However, when I give it a simple Gaussian function, the FFT of that Gaussian is not a Gaussian. It's close, but it's split in half so that each half sits at either end of the x axis.
The Gaussian function I'm calculating is
y = exp(-x^2)
Here is my code:
from cmath import *
from numpy import multiply
from numpy.fft import fft
from pylab import plot, show
def frange (min_value, max_value, step):
    """ Basically the standard range() function but with float support """
    value = float(min_value)
    array = []
    while value < float(max_value):
        array.append(value)
        value += float(step)
    return array
N = 256.0 # number of steps
y = []
x = frange(-5, 5, 10/N)
# fill array y with values of the Gaussian function
cache = -multiply(x, x)
for i in cache: y.append(exp(i))
Y = fft(y)
# plot the fft of the gausian function
plot(x, abs(Y))
show()
The result is not quite right, cause the FFT of a Gaussian function should be a Gaussian function itself...
np.fft.fft returns a result in so-called "standard order": (from the docs)
If A = fft(a, n), then A[0] contains the zero-frequency term (the mean of the signal), which is always purely real for real inputs. Then A[1:n/2] contains the positive-frequency terms, and A[n/2+1:] contains the negative-frequency terms, in order of decreasingly negative frequency.
The function np.fft.fftshift rearranges the result into the order most humans expect (and which is good for plotting):
The routine np.fft.fftshift(A) shifts transforms and their frequencies to put the zero-frequency components in the middle...
So using np.fft.fftshift:
import matplotlib.pyplot as plt
import numpy as np
N = 128
x = np.arange(-5, 5, 10./(2 * N))
y = np.exp(-x * x)
y_fft = np.fft.fftshift(np.abs(np.fft.fft(y))) / np.sqrt(len(y))
plt.plot(x,y)
plt.plot(x,y_fft)
plt.show()
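The "standard order" is easiest to see on the frequency axis itself. A minimal sketch for n = 8 samples with unit spacing, using np.fft.fftfreq and np.fft.fftshift:

```python
import numpy as np

# Bin frequencies in the order np.fft.fft returns them
freqs = np.fft.fftfreq(8)
print(freqs)     # [ 0.     0.125  0.25   0.375 -0.5   -0.375 -0.25  -0.125]

# fftshift reorders them so that zero frequency sits in the middle
shifted = np.fft.fftshift(freqs)
print(shifted)   # [-0.5   -0.375 -0.25  -0.125  0.     0.125  0.25   0.375]
```

Applying the same fftshift to the transform keeps each value aligned with its frequency.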
Your result is not even close to a Gaussian, not even one split into two halves.
To get the result you expect, you will have to position your own Gaussian with the center at index 0, and the result will also be positioned that way. Try the following code:
from pylab import *
N = 128
x = r_[arange(0, 5, 5./N), arange(-5, 0, 5./N)]
y = exp(-x*x)
y_fft = fft(y) / sqrt(2 * N)
plot(r_[y[N:], y[:N]])
plot(r_[y_fft[N:], y_fft[:N]])
show()
The plot commands split the arrays into two halves and swap them to get a nicer picture.
It is being displayed with the center (i.e. mean) at coefficient index zero. That is why it appears that the right half is on the left, and vice versa.
EDIT: Explore the following code:
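import numpy as np
from scipy.signal.windows import gaussian
import pylab
x = gaussian(2048, 10)
X = np.abs(np.fft.fft(x))  # scipy.absolute and scipy.fft(x) no longer work in current SciPy
pylab.plot(x)
pylab.plot(X)
pylab.plot(X[np.r_[1024:2048, 0:1024]])  # range(...) + range(...) only worked in Python 2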
The last line will plot X starting from the center of the vector, then wrap around to the beginning.
A fourier transform implicitly repeats indefinitely, as it is a transform of a signal that implicitly repeats indefinitely. Note that when you pass y to be transformed, the x values are not supplied, so in fact the gaussian that is transformed is one centred on the median value between 0 and 256, so 128.
Remember also that translation of f(x) is phase change of F(x).
Following on from Sven Marnach's answer, a simpler version would be this:
from pylab import *
N = 128
x = ifftshift(arange(-5,5,5./N))
y = exp(-x*x)
y_fft = fft(y) / sqrt(2 * N)
plot(fftshift(y))
plot(fftshift(y_fft))
show()
This yields a plot identical to the above one.
The key (and this seems strange to me) is that NumPy's assumed data ordering --- in both frequency and time domains --- is to have the "zero" value first. This is not what I'd expect from other implementations of FFT, such as the FFTW3 libraries in C.
This was slightly fudged in the answers from unutbu and Steve Tjoa above, because they're taking the absolute value of the FFT before plotting it, thus wiping away the phase issues resulting from not using the "standard order" in time.
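The ordering and phase points made in these answers can be verified against the analytical transform. The following is a sketch, assuming the Fourier convention with exp(-2*pi*i*f*x) in the integral, under which exp(-x^2) transforms to sqrt(pi)*exp(-pi^2*f^2); the grid parameters and variable names are chosen for this illustration.

```python
import numpy as np

N = 256
dx = 10.0 / N
x = (np.arange(N) - N // 2) * dx        # grid from -5 up to just below 5, x[N // 2] == 0
y = np.exp(-x**2)

# ifftshift moves x = 0 to index 0 (the order the FFT assumes),
# fftshift moves zero frequency back to the middle for comparison,
# and the factor dx approximates the Fourier integral
Y = dx * np.fft.fftshift(np.fft.fft(np.fft.ifftshift(y)))
f = np.fft.fftshift(np.fft.fftfreq(N, d=dx))

expected = np.sqrt(np.pi) * np.exp(-(np.pi * f) ** 2)

print(np.max(np.abs(Y.real - expected)))   # small
print(np.max(np.abs(Y.imag)))              # small: no phase factor remains

# Without ifftshift the magnitude is unchanged but the phase alternates in sign
Y_unshifted = dx * np.fft.fftshift(np.fft.fft(y))
print(np.max(np.abs(np.abs(Y_unshifted) - expected)))
```

With the input reordered first, the transform comes out real and Gaussian; skipping that step leaves the magnitude intact and only changes the phase, which is why taking the absolute value hides the issue.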
