I'm plotting sine waves (left column) and their respective frequency domain representations (right column):
The first wave (amplitude: 10; frequency: 0.5) has a somewhat messed-up FFT representation.
The second wave (amplitude: 15; frequency: 5.0) looks exactly as expected.
The third wave is just the first and the second wave summed up, and it inherits their problems.
The second frequency plot has exactly one peak at x=5 (frequency), y=15 (amplitude).
Why does the first frequency plot have multiple peaks when there's only one frequency?
import numpy as np
import matplotlib.pyplot as plt
def sine(freq, time_interval, rate, amp=1):
    w = 2. * np.pi * freq
    t = np.linspace(0, time_interval, int(time_interval * rate))
    y = amp * np.sin(w * t)
    return y
def buildData():
    secs = 3
    Fs = 44100
    # frequency, duration, sampling rate, amplitude
    y1 = sine(0.5, secs, Fs, 10)
    y2 = sine(5, secs, Fs, 15)
    y3 = y1 + y2
    signals = [y1, y2, y3]
    showSignals(signals, Fs, secs)
def showSignals(signals, fs, secs):
    nrSigs = len(signals)
    fig = plt.figure()
    fig.subplots_adjust(hspace=.5)
    for i in range(len(signals)):
        cols = 2
        pltIdc = []
        for col in range(1, cols + 1):
            pltIdc.append(i * cols + col)
        s = signals[i]
        t = np.arange(0, secs, 1.0 / fs)
        ax1 = plt.subplot(nrSigs, cols, pltIdc[0])
        ax1.set_title('signal')
        ax1.set_xlabel('time')
        ax1.set_ylabel('amplitude')
        ax1.plot(t, s)
        amps = 2 * abs(np.fft.fft(s)) / len(s)  # scaled power spectrum
        amps = amps[0:len(amps) // 2]  # because of the symmetry
        amps = amps[0:50]  # only the first 50 frequencies, arbitrarily chosen
        # this should be close to the amplitude:
        print('magnitude of amplitudes: ' + str(sum(amps * amps) ** 0.5))
        freqs = np.arange(0, len(amps), 1) / secs
        ax2 = plt.subplot(nrSigs, cols, pltIdc[1])
        ax2.grid(True)
        ax2.set_title(r"$\frac{2 \cdot fft(s)}{len(s)}$")
        ax2.set_xlabel('freq')
        ax2.set_ylabel('amplitude')
        ax2.stem(freqs, amps)
    plt.show()
buildData()
The FFT routine performs a fast implementation of the discrete Fourier transform, which decomposes a time-series signal over an N-length orthonormal basis consisting of the Fourier "roots of unity".
You will get a single nonzero value in the FFT output if and only if you input a signal that is one of the Fourier basis functions (or a phase-rotated version thereof), because it will have a non-zero inner product with one and only one member of the basis set (by definition).
Your first example has 1.5 cycles within the analysis window, so it cannot be a root of unity (one property of the Fourier basis functions is that they have integral cycle counts within the analysis window). Consequently, there is a non-zero "DC offset" (the average over the analysis window is not exactly zero), which will always yield a "DC" term (a nonzero Fourier contribution at index 0, corresponding to a DC offset). Because there is a non-integral cycle count within the analysis window, you also get contributions from other frequencies in the FFT output, in addition to the dominant contribution from the frequency nearest that of your sinusoid. This is as expected: any sinusoid that is not itself a Fourier basis function will have a non-zero inner product with multiple Fourier basis functions (and hence multiple spectral contributions in the FFT output).
Your 3rd example is just the sum of the two others, so by linearity of the Fourier transform the output of the FFT is simply the sum of the FFTs of the two individual signals. This is also expected: FFT(a+b) = FFT(a) + FFT(b).
A DFT or FFT will only produce a single point result (a spike in the graph) from a sinusoid if the frequency is periodic in exactly an integer number of periods within the FFT's length. Otherwise the energy gets spread out among all the other FFT result bins (though mostly into nearby frequency bins). This is not "messed up" but normal expected behavior for finite-length DFTs.
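A minimal sketch of this effect, using nothing beyond NumPy: an integer cycle count puts all the energy into a single bin, while a fractional cycle count smears it across the neighboring bins.

import numpy as np

N = 1024
n = np.arange(N)
integral = np.sin(2 * np.pi * 8 * n / N)      # exactly 8 cycles in the window
fractional = np.sin(2 * np.pi * 8.5 * n / N)  # 8.5 cycles: not a basis function

mag_int = np.abs(np.fft.rfft(integral))
mag_frac = np.abs(np.fft.rfft(fractional))

print(mag_int[6:11].round(1))   # essentially all energy in bin 8: [0. 0. 512. 0. 0.]
print(mag_frac[6:11].round(1))  # leakage spread around bins 8-9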
Related
There are many questions on this topic, and I have cycled through a lot of them getting conceptual pointers on handling frequencies (here and here), documentation on numpy functions (here), how-to information on extracting magnitude and phase (here), and stepping outside the site, for example this or this.
However, only the painful "proving it" to myself with simple examples and checking the output of different functions contrasted to their manual implementation has given me a bit of an idea.
This answer attempts to document and share details related to the DFT in Python that may constitute barriers of entry if not explained in simple terms.
The DFT (the FFT being its fast algorithmic computation) is a dot product between a finite, discrete number of samples N of an analogue signal s(t) (a function of time or space) and a set of basis vectors of complex exponentials (sin and cos functions). Although the sample is naturally finite and may show no periodicity, it is implicitly thought of as a periodically repeating discrete function. Even when dealing with real-valued signals (the usual situation) it is convenient to work with complex numbers (Euler's formula). It may be intimidating to apply the function to a signal with np.fft.fft(s), only to get the output coefficients as complex numbers and get stuck in their interpretation. Some steps are essential:
What are the frequencies in the complex exponentials?
The DFT does not operate in physical units such as Hertz; natively, its frequencies are just indices (k).
The indices k range from 0 to N - 1 and can be thought of as having units of cycles / set (the set being the N samples of the signal s). I will omit discussing the Nyquist limit, but for real signals the frequencies form a mirror image after N / 2, and are given as negative values of decreasing magnitude after that point (not a problem within the framework of implicit periodicity). The frequencies used in the FFT are not simply k, but k / N, thought of as having units of cycles / sample. See this reference. Example (reference): if a signal is sampled N = 5 times, the frequencies are np.fft.fftfreq(5), yielding [0, 0.2, 0.4, -0.4, -0.2], i.e. [0/5, 1/5, 2/5, -2/5, -1/5].
To convert these frequencies to meaningful units (e.g. Hertz or 1/mm), the values in cycles/sample above need to be divided by the sampling interval T (e.g. the distance in seconds between samples). Continuing with the example above, there is a built-in call, np.fft.fftfreq(5, d=T): if the analogue signal s is sampled 5 times at equidistant intervals T = 1/2 sec, for a total sample duration of NT = 5 x 1/2 sec, the frequencies will be np.fft.fftfreq(5, d=1/2), yielding [0, 0.4, 0.8, -0.8, -0.4], or [0/NT, 1/NT, 2/NT, -2/NT, -1/NT].
Either the normalized or the un-normalized frequencies can be used to build the angular frequencies ω_m, expressed as ω_m = 2π k/(NT), where NT is the total duration over which the signal was sampled. The index k thus produces multiples of the fundamental frequency ω_0, which corresponds to k = 1: the frequency of the (co-)sine wave that completes exactly one oscillation over NT (here).
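Both fftfreq calls from the examples above can be checked directly in a REPL:

import numpy as np

print(np.fft.fftfreq(5))         # [ 0.   0.2  0.4 -0.4 -0.2]  (cycles/sample)
print(np.fft.fftfreq(5, d=0.5))  # [ 0.   0.4  0.8 -0.8 -0.4]  (Hz, with T = 1/2 sec)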
Magnitude, frequency and phase of the coefficients in the FFT
Given the output of the FFT S = np.fft.fft(s), the magnitude of the output coefficients (here) is just the Euclidean norm of the complex numbers in the output coefficients, adjusted for the symmetry in real signals (x 2) and for the number of samples (1/N): magnitudes = 2/N * np.abs(S) (the doubling applies to the one-sided spectrum and should not be applied to the DC and Nyquist bins).
The frequencies are matched to the coefficients through the call explained above, np.fft.fftfreq(N), or, to incorporate the actual analogue frequency units directly, frequencies = np.fft.fftfreq(N, d=T).
The phase of each coefficient is the angle of the complex number in polar form: phase = np.angle(S), which is equivalent to np.arctan2(np.imag(S), np.real(S)); a plain np.arctan(np.imag(S)/np.real(S)) would lose the quadrant.
How to find the dominant frequencies in the signal s in the FFT and their coefficients?
Plotting aside, finding the index k corresponding to the frequency with the highest magnitude can be accomplished as index = np.argmax(np.abs(S)). To find the 4 indices with the highest magnitude, for example, the call is indices = np.argpartition(np.abs(S), -4)[-4:] (the partition must be over the magnitudes, not over the complex array S itself).
And finding the actual corresponding coefficient: S[index] with frequency freq_max = np.fft.fftfreq(N, d=T)[index].
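For example, a small sketch tying these calls together (assuming S = np.fft.fft(s) and the sampling interval T from above):

k = 4
top = np.argpartition(np.abs(S), -k)[-k:]          # k largest-magnitude bins, unordered
top = top[np.argsort(-np.abs(S[top]))]             # sorted by descending magnitude
dominant_freqs = np.fft.fftfreq(len(S), d=T)[top]  # their frequencies in physical units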
Reproducing the original signal after obtaining the coefficients:
Reproducing s through sines and cosines (p.150 in here):
Re = np.real(S[index])
Im = np.imag(S[index])
s_recon = 2/N * (Re * np.cos(2 * np.pi * freq_max * t) - Im * np.sin(2 * np.pi * freq_max * t))  # assumes freq_max is the positive-frequency bin
Here is a complete example:
import numpy as np
import matplotlib.pyplot as plt
N = 10000 # Sample points
T = 1/5000 # Spacing
# Total duration N * T= 2
t = np.linspace(0.0, N*T, N, endpoint=False) # Time: Vector of 10,000 elements from 0 to N*T=2.
frequency = np.fft.fftfreq(t.size, d=T) # Normalized Fourier frequencies in spectrum.
f0 = 25 # Frequency of the sampled wave
phi = np.pi/8 # Phase
A = 50 # Amplitude
s = A * np.cos(2 * np.pi * f0 * t + phi) # Signal
S = np.fft.fft(s) # Unnormalized FFT
index = np.argmax(np.abs(S))
print(S[index])
magnitude = np.abs(S[index]) * 2/N
freq_max = frequency[index]
phase = np.angle(S[index])  # np.arctan2(Im, Re); np.arctan(Im/Re) would lose the quadrant
print(f"magnitude: {magnitude}, freq_max: {freq_max}, phase: {phase}")
print(phi)
fig, [ax1,ax2] = plt.subplots(nrows=2, ncols=1, figsize=(10, 5))
ax1.plot(t,s, linewidth=0.5, linestyle='-', color='r', marker='o', markersize=1,markerfacecolor=(1, 0, 0, 0.1))
ax1.set_xlim([0, .31])
ax1.set_ylim([-51,51])
ax2.plot(frequency[0:N//2], 2/N * np.abs(S[0:N//2]), '.', color='xkcd:lightish blue', label='amplitude spectrum')
plt.xlim([0, 100])
plt.show()
Re = np.real(S[index])
Im = np.imag(S[index])
s_recon = 2/N * (Re * np.cos(2 * np.pi * freq_max * t) - Im * np.sin(2 * np.pi * freq_max * t))  # assumes freq_max is the positive-frequency bin
fig = plt.figure(figsize=(10, 2.5))
plt.xlim(0,0.3)
plt.ylim(-51,51)
plt.plot(t,s_recon, linewidth=0.5, linestyle='-', color='r', marker='o', markersize=1,markerfacecolor=(1, 0, 0, 0.1))
plt.show()
print(np.allclose(s, s_recon))  # True; note s.all() == s_recon.all() would only compare truth values, not the arrays
So, I am trying to figure out how to use the DFT in practice to detect prevalent frequencies in a signal. I have been trying to wrap my head around what Fourier transforms are and how DFT algorithms work, but apparently I still have a ways to go. I have written some code to generate a signal (since the intent is to work with music, I generate a major C chord, hence the weird frequency values) and then tried to work back to the frequency numbers. Here is the code I have:
sr = 44100 # sample rate
x = np.linspace(0, 1, sr) # one second of signal
tpi = 2 * np.pi
data = np.sin(261.63 * tpi * x) + np.sin(329.63 * tpi * x) + np.sin(392.00 * tpi * x)
freqs = np.fft.fftfreq(sr)
fft = np.fft.fft(data)
idx = np.argsort(np.abs(fft))
fft = fft[idx]
freqs = freqs[idx]
print(freqs[-6:] * sr)
This gives me [-262. 262. -330. 330. -392. 392.]
which is different from the frequencies I encoded (261.63, 329.63 and 392.0). What am I doing wrong and how do I fix it?
Indeed, if the frame lasts T seconds, the frequencies of the DFT are k/T Hz, where k is an integer. As a consequence, oversampling does not improve the accuracy of the estimated frequency, as long as these frequencies are identified as maxima of the magnitude of the DFT. On the contrary, considering longer frames lasting 100 s would induce a spacing between the DFT frequencies of 0.01 Hz, which might be good enough to produce the expected frequencies. It is possible to do much better by estimating the frequency of a peak as its mean frequency with respect to power density.
Figure 1: even after applying a Tukey window, the DFT of the windowed signal is not a sum of Dirac impulses: there is still some spectral leakage at the bottom of the peaks. This power must be accounted for when the frequencies are estimated.
Another issue is that the length of the frame is not a multiple of the period of the signal, which may not be periodic anyway. Nevertheless, the DFT is computed as if the signal were periodic, but discontinuous at the edge of the frame. This induces spurious frequencies, described as spectral leakage. Windowing is the reference method for mitigating the problems related to this artificial discontinuity: the value of a window decreases continuously to zero near the edges of the frame. There is a list of window functions, and a lot of window functions are available in scipy.signal. A window is applied as:
tukey_window = signal.windows.tukey(len(data), 0.5, True)
data = data * tukey_window
At that point, the frequencies exhibiting the largest magnitude are still 262, 330 and 392. Applying a window only makes the peaks more visible: the DFT of the windowed signal features three distinct peaks, each with a central lobe and side lobes whose shapes depend on the DFT of the window. The lobes of these windows are symmetric: the central frequency can therefore be computed as the mean frequency of the peak, with respect to power density.
import numpy as np
from scipy import signal

sr = 44100  # sample rate
x = np.linspace(0, 1, sr)  # one second of signal
tpi = 2 * np.pi
data = np.sin(261.63 * tpi * x) + np.sin(329.63 * tpi * x) + np.sin(392.00 * tpi * x)

# a window...
tukey_window = signal.windows.tukey(len(data), 0.5, True)
data = data * tukey_window
data -= np.mean(data)
fft = np.fft.rfft(data, norm="ortho")

def abs2(x):
    return x.real**2 + x.imag**2

fftmag = abs2(fft)[:1000]
peaks, _ = signal.find_peaks(fftmag, height=np.max(fftmag) * 0.1)
print("potential frequencies", peaks)

# compute the mean frequency of each peak with respect to power density
powerpeak = np.zeros(len(peaks))
powerpeaktimefrequency = np.zeros(len(peaks))
for i in range(1000):
    # assign bin i to its nearest detected peak
    dist = 1000
    jnear = 0
    for j in range(len(peaks)):
        if dist > np.abs(i - peaks[j]):
            dist = np.abs(i - peaks[j])
            jnear = j
    powerpeak[jnear] += fftmag[i]
    powerpeaktimefrequency[jnear] += fftmag[i] * i
powerpeaktimefrequency = np.divide(powerpeaktimefrequency, powerpeak)
print('corrected frequencies', powerpeaktimefrequency)
The resulting estimated frequencies are 261.6359 Hz, 329.637 Hz and 392.0088 Hz: much better than 262, 330 and 392 Hz, and it satisfies the required 0.01 Hz accuracy for such a pure, noiseless input signal.
DFT result bins are separated by Fs/N in frequency, where N is the length of the FFT. Thus, the duration of your DFT window limits the resolution, in terms of the spacing between DFT result bin centers.
But, for well separated frequency peaks in low noise (high S/N), instead of increasing the duration of the data, you can instead estimate the frequency peak locations to a higher resolution by interpolating the DFT result between the DFT result bins. You can try parabolic interpolation for a coarse frequency peak location estimate, but windowed Sinc interpolation (essentially Shannon-Whittaker reconstruction) would provide far better frequency estimation accuracy and resolution (given a low enough noise floor around the frequency peak(s) of interest, e.g. no nearby sinusoids in your artificial waveform case).
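A minimal sketch of the parabolic-interpolation idea (not the author's code; a quadratic is fit through the log-magnitudes of the three bins around a peak, and its vertex gives a sub-bin frequency estimate):

import numpy as np

def parabolic_peak(mag, k):
    # fit a parabola through log-magnitude bins k-1, k, k+1; return the vertex position
    a, b, c = np.log(mag[k - 1]), np.log(mag[k]), np.log(mag[k + 1])
    return k + 0.5 * (a - c) / (a - 2 * b + c)  # vertex offset lies in (-0.5, 0.5) bins

# usage sketch with one component of the chord from the question:
sr = 44100
t = np.arange(sr) / sr                       # exactly one second, so bins are 1 Hz apart
x = np.sin(2 * np.pi * 261.63 * t)
mag = np.abs(np.fft.rfft(x * np.hanning(len(x))))
k = np.argmax(mag)
print(parabolic_peak(mag, k) * sr / len(x))  # should land close to 261.63 Hz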
Since you want to get a resolution of 0.01 Hz, you will need to sample at least 100 sec worth of data. You will be able to resolve frequencies up to about 22.05 kHz.
Is there a way to generate a quasi-periodic signal (a signal with a specific frequency distribution, like a normal distribution)? In addition, the signal should not have a stationary frequency distribution, since the inverse Fourier transform of a Gaussian function is still a Gaussian function, while what I want is an oscillating signal.
I used a discrete series of normally distributed frequencies, together with linearly spaced initial phases, to generate the signal as a sum of sinusoids (see the code below).
[equations and figures omitted: the frequency distribution, the initial phases, the generated signal (figure 4), and its FFT spectrum (figure 5)]
I found that the spectrum is only similar to a Gaussian function for a short time period after t=0 (corresponding to the few extremely high peaks on the left of figure 4), and the rest of the signal only contributes to the glitches on both sides of the peak in figure 5.
I thought the problem might come from the initial phases. I tried randomly distributed initial phases, but that didn't work either.
So, what is the right way to generate such a signal?
Here is my python code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import erf, erfinv

def gaussian_frequency(array_length=10000, central_freq=100, std=10):
    n = np.arange(array_length)
    f = np.sqrt(2) * std * erfinv(2 * n / array_length - erf(central_freq / np.sqrt(2) / std)) + central_freq
    return f

f = gaussian_frequency()
phi = np.linspace(0, 2 * np.pi, len(f))
t = np.linspace(0, 100, 100000)
signal = np.zeros(len(t))
for k in range(len(f)):
    signal += np.sin(phi[k] + 2 * np.pi * f[k] * t)
def fourierPlt(signal, TIMESTEP=.001):
    num_samples = len(signal)
    k = np.arange(num_samples)
    Fs = 1 / TIMESTEP
    T = num_samples / Fs
    frq = k / T  # two-sided frequency range
    frq = frq[range(int(num_samples / 2))]  # one-sided frequency range
    fourier = np.fft.fft(signal) / num_samples  # fft computing and normalization
    fourier = abs(fourier[range(int(num_samples / 2))])
    fourier = fourier / sum(fourier)
    plt.plot(frq, fourier, 'r', linewidth=1)
    plt.title("Fast Fourier Transform")
    plt.xlabel('$f$/Hz')
    plt.ylabel('Normalized Spectrum')
    return (frq, fourier)

fourierPlt(signal)
If you want your signal to be real-valued, you need to mirror the frequency component: you need the positive and negative frequencies to be complex conjugates of each other. I presume you thought of this.
A Gaussian-shaped frequency spectrum (with mean at f=0) yields a Gaussian-shaped signal.
Shifting the frequency spectrum by a frequency f0 causes the time-domain signal to be multiplied by exp(j 2 π f0 t). That is, you only change its phase.
Assuming you still want a real-valued time signal, you'll have to duplicate the frequency spectrum and shift it in both directions. This causes a multiplication by exp(j 2π f0 t) + exp(-j 2π f0 t) = 2 cos(2π f0 t).
Thus, your signal is a Gaussian modulating a cosine.
I'm using MATLAB here for the example, I hope you can easily translate this to Python:
t=0:300;
s=exp(-(t-150).^2/30.^2) .* cos(2*pi*0.1*t);
subplot(2,1,1)
plot(t,s)
xlabel('time')
S=abs(fftshift(fft(s)));
f=linspace(-0.5,0.5,length(S));
subplot(2,1,2)
plot(f,S)
xlabel('frequency')
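For readers following along in Python, here is a direct NumPy/Matplotlib translation of the MATLAB sketch above:

import numpy as np
import matplotlib.pyplot as plt

t = np.arange(301)
s = np.exp(-(t - 150)**2 / 30.0**2) * np.cos(2 * np.pi * 0.1 * t)  # Gaussian times cosine

fig, (ax1, ax2) = plt.subplots(nrows=2, ncols=1)
ax1.plot(t, s)
ax1.set_xlabel('time')

S = np.abs(np.fft.fftshift(np.fft.fft(s)))  # centered magnitude spectrum
f = np.linspace(-0.5, 0.5, len(S))
ax2.plot(f, S)
ax2.set_xlabel('frequency')
plt.show()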
For those interested in image processing: the Gabor filter is exactly this, but with the frequency spectrum shifted in only one direction. The resulting filter is complex, and the magnitude of the filtering result is used. This leads to a phase-independent filter.
I am trying to implement a periodogram in Python based on the description of Bartlett's method, and compared the result with that from SciPy, setting overlap=0 and window='boxcar' (rectangular window). However, my result is off by some scale factor. Can someone point out what is wrong with my code? Thanks
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
def my_bartlett_periodogram(x, fs, nperseg, nfft):
    nsegments = len(x) // nperseg
    psd = np.zeros(nfft)
    for segment in x.reshape(nsegments, nperseg):
        psd += np.abs(np.fft.fft(segment))**2 / nfft
    psd[0] = 0  # important!!
    psd /= nsegments
    psd = psd[0:nfft // 2]
    freq = np.linspace(0, fs / 2, nfft // 2)
    return freq, psd
def plot_output(t, x, freq1, psd1, freq2, psd2):
    fig, axs = plt.subplots(3, 1, figsize=(12, 15))
    axs[0].plot(t[:300], x[:300])
    axs[1].plot(freq1, psd1)
    axs[2].plot(freq2, psd2)
    axs[0].set_title('Input (len=8192, fs=512)')
    axs[1].set_title('Bartlett Periodogram (nfft=512, zero-overlap, no-window)')
    axs[2].set_title('Scipy Periodogram (nfft=512, zero-overlap, no-window)')
    axs[0].set_xticks([])
    axs[2].set_xlabel('Freq (Hz)')
    plt.show()
# Run
fs = nfft = nperseg = 512
t = np.arange(8192) / fs
x = np.sin(2*np.pi*50*t) + np.sin(2*np.pi*100*t) + np.sin(2*np.pi*150*t)
freq1, psd1 = my_bartlett_periodogram(x, fs, nperseg, nfft)
freq2, psd2 = signal.welch(x, fs, nperseg=nperseg, nfft=nfft, window='boxcar', noverlap=0)
plot_output(t, x, freq1, psd1, freq2, psd2)
TL;DR:
Nothing wrong with the code. But welch returns the power spectral density, which is the power spectrum divided by fs, and it compensates for cutting away half the spectrum by multiplying by 2.
To compare the two, psd2 * fs / 2 should be very similar to psd1.
According to Wikipedia the calculation of psd seems correct:
The original N point data segment is split up into K (non-overlapping) data segments, each of length M
For each segment, compute the periodogram by computing the discrete Fourier transform (DFT version which does not divide by M), then computing the squared magnitude of the result and dividing this by M.
Average the result of the periodograms above for the K data segments.
So whom shall we trust more, Wikipedia or scipy? I would tend towards the latter, but we can find out for ourselves. According to Parseval's theorem, the integral over the squared signal should be the same as the integral over the squared FFT magnitude. Since the periodogram is obtained from the squared FFT, the theorem should hold approximately.
print(np.mean(x**2))    # 1.499727698431174
print(np.mean(psd1))    # (1.4999999999999991+0j)
print(np.mean(psd2))    # 0.0058365758754863788
That's close enough for psd1, so let's assume it's correct. But I refuse to believe that scipy should be so blatantly wrong! Let's take a closer look at the documentation and see what it has to say about the scaling argument (emphasis mine):
Selects between computing the power spectral density (‘density’) where Pxx has units of V**2/Hz and computing the power spectrum (‘spectrum’) where Pxx has units of V**2, if x is measured in V and fs is measured in Hz. Defaults to ‘density’
Uh-huh! welch's result is the power spectral density, which means it has units of power per Hz. However, we compared it against the signal power. If we multiply psd2 by the sampling rate to get rid of the 1/Hz units, it's the same as psd1. Well, except for a factor 2. This factor is meant to compensate for cutting away half the spectrum. If we set return_onesided=False to get the full spectrum, that factor is gone.
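A quick numeric sketch of this claim, assuming the question's script above has already been run (note that welch returns nfft//2 + 1 one-sided bins, and the DC bin is skipped because my_bartlett_periodogram zeroes it):

n = len(psd1)  # 256 bins from the manual periodogram
print(np.allclose(psd1[1:], psd2[1:n] * fs / 2))  # expect True

# asking welch for the two-sided density makes the factor 2 disappear:
freq3, psd3 = signal.welch(x, fs, nperseg=nperseg, nfft=nfft,
                           window='boxcar', noverlap=0, return_onesided=False)
print(np.allclose(psd1[1:], psd3[1:n] * fs))      # expect True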