I am trying to implement a periodogram in Python based on the description of Bartlett's method, and I compared the result with SciPy's by setting noverlap=0 and window='boxcar' (rectangular window). However, my result is off by some scale factor. Can someone point out what is wrong with my code? Thanks
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
def my_bartlett_periodogram(x, fs, nperseg, nfft):
    nsegments = len(x) // nperseg
    psd = np.zeros(nfft)
    # accumulate |FFT|^2 / nfft over the non-overlapping segments
    for segment in x.reshape(nsegments, nperseg):
        psd += np.abs(np.fft.fft(segment))**2 / nfft
    psd[0] = 0 # important!!
    psd /= nsegments
    psd = psd[0 : nfft//2]
    freq = np.linspace(0, fs/2, nfft//2)
    return freq, psd
def plot_output(t, x, f1, psd1, f2, psd2):
    fig, axs = plt.subplots(3,1, figsize=(12,15))
    axs[0].plot(t[:300], x[:300])
    # use the function arguments rather than the global freq1/freq2
    axs[1].plot(f1, psd1)
    axs[2].plot(f2, psd2)
    axs[0].set_title('Input (len=8192, fs=512)')
    axs[1].set_title('Bartlett Periodogram (nfft=512, zero-overlap, no-window)')
    axs[2].set_title('Scipy Periodogram (nfft=512, zero-overlap, no-window)')
    axs[0].set_xticks([])
    axs[2].set_xlabel('Freq (Hz)')
    plt.show()
# Run
fs = nfft = nperseg = 512
t = np.arange(8192) / fs
x = np.sin(2*np.pi*50*t) + np.sin(2*np.pi*100*t) + np.sin(2*np.pi*150*t)
freq1, psd1 = my_bartlett_periodogram(x, fs, nperseg, nfft)
freq2, psd2 = signal.welch(x, fs, nperseg=nperseg, nfft=nfft, window='boxcar', noverlap=0)
plot_output(t, x, freq1, psd1, freq2, psd2)
TL;DR:
Nothing wrong with the code. But welch returns the power spectral density, which is the power spectrum divided by fs, and it compensates for cutting away half the spectrum by multiplying by 2.
To compensate, psd2 * fs / 2 should be very similar to psd.
According to Wikipedia the calculation of psd seems correct:
The original N point data segment is split up into K (non-overlapping) data segments, each of length M
For each segment, compute the periodogram by computing the discrete Fourier transform (DFT version which does not divide by M), then computing the squared magnitude of the result and dividing this by M.
Average the result of the periodograms above for the K data segments.
So whom shall we trust more, Wikipedia or scipy? I would tend towards the latter, but we can find out for ourselves. According to Parseval's theorem the integral over the squared signal should be the same as the integral over the squared FFT magnitude. Since the periodogram is obtained from the squared FFT, the theorem should hold approximately.
print(np.mean(y**2)) # 1.499727698431174
print(np.mean(psd)) # (1.4999999999999991+0j)
print(np.mean(psd2)) # 0.0058365758754863788
That's close enough for psd, so let's assume it's correct. But I refuse to believe that scipy should be so blatantly wrong! Let's take a closer look at the documentation and see what they have to say about the scaling argument (emphasis mine):
Selects between computing the power spectral density (‘density’) where Pxx has units of V**2/Hz and computing the power spectrum (‘spectrum’) where Pxx has units of V**2, if x is measured in V and fs is measured in Hz. Defaults to ‘density’
Uh-huh! welch's result is the power spectral density, which means it has units of Power per Hz. However, we compared it against the signal power. If we multiply psd2 with the sampling rate to get rid of the 1/Hz units it's the same as psd. Well, except for a factor 2. This factor is meant to compensate for cutting away half the spectrum. If we set return_onesided=False to get the full spectrum that factor is gone.
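To make the two factors concrete, here is a minimal sketch that reuses the signal from the question. It assumes detrend=False (not the welch default) so that the DC bin can be compared too; the names bartlett, one_sided and two_sided are only illustrative.

```python
import numpy as np
from scipy import signal

fs = nperseg = 512
t = np.arange(8192) / fs
x = np.sin(2*np.pi*50*t) + np.sin(2*np.pi*100*t) + np.sin(2*np.pi*150*t)

# Bartlett estimate: average of |FFT|^2 / M over non-overlapping segments
segments = x.reshape(-1, nperseg)
bartlett = np.mean(np.abs(np.fft.fft(segments, axis=1))**2 / nperseg, axis=0)

# welch with a rectangular window and no overlap, detrending switched off
_, one_sided = signal.welch(x, fs, window="boxcar", nperseg=nperseg,
                            noverlap=0, detrend=False)
_, two_sided = signal.welch(x, fs, window="boxcar", nperseg=nperseg,
                            noverlap=0, detrend=False, return_onesided=False)

# One-sided: bins strictly between DC and Nyquist carry the factor 2
interior = slice(1, nperseg // 2)
match_one_sided = np.allclose(one_sided[interior] * fs / 2, bartlett[interior])

# Two-sided: only the 1/fs density factor remains
match_two_sided = np.allclose(two_sided * fs, bartlett)

print(match_one_sided, match_two_sided)
```

The DC and Nyquist bins are left out of the one-sided comparison because they have no mirrored partner and are therefore not doubled.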
I am struggling with the correct normalization of the power spectral density (and its inverse).
I am given a real problem, let's say the readings of an accelerometer in the form of the power spectral density (psd) in units of Amplitude^2/Hz. I would like to translate this back into a randomized time series. However, first I want to understand the "forward" direction, time series to PSD.
According to [1], the PSD of a time series x(t) can be calculated by:
PSD(w) = 1/T * abs(F(w))^2 = df * abs(F(w))^2
in which T is the sampling time of x(t) and F(w) is the Fourier transform of x(t) and df=1/T is the frequency resolution in the Fourier space. However, the results I am getting are not equal to what I am getting using the scipy Welch method, see code below.
This first block of code is taken from the scipy.signal.welch documentation:
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
fs = 10e3
N = 1e5
amp = 2*np.sqrt(2)
freq = 1234.0
noise_power = 0.001 * fs / 2
time = np.arange(N) / fs
x = amp*np.sin(2*np.pi*freq*time)
x += np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
f, Pxx_den = signal.welch(x, fs, nperseg=1024)
plt.semilogy(f, Pxx_den)
plt.ylim([0.5e-3, 1])
plt.xlabel('frequency [Hz]')
plt.ylabel('PSD [V**2/Hz]')
plt.show()
First thing I noticed is that the plotted psd changes with the variable fs which seems strange to me. (Maybe I need to adjust the nperseg argument then accordingly? Why is nperseg not set to fs automatically then?)
My code would be the following. (Note that I defined my own fft_full function, which already takes care of the correct Fourier transform normalization; I verified this by checking Parseval's theorem.)
import scipy.fftpack as fftpack
def fft_full(xt,yt):
    dt = xt[1] - xt[0]
    x_fft=fftpack.fftfreq(xt.size,dt)
    y_fft=fftpack.fft(yt)*dt
    return (x_fft,y_fft)
xf,yf=fft_full(time,x)
df=xf[1] - xf[0]
psd=np.abs(yf)**2 *df
plt.figure()
plt.semilogy(xf, psd)
#plt.ylim([0.5e-3, 1])
plt.xlim(0,)
plt.xlabel('frequency [Hz]')
plt.ylabel('PSD [V**2/Hz]')
plt.show()
Unfortunately, I am not yet allowed to post images but the two plots do not look the same!
I would greatly appreciate if someone could explain to me where I went wrong and settle this once and for all :)
[1]: Eq. 2.82. Random Vibrations in Spacecraft Structures Design
Theory and Applications, Authors: Wijker, J. Jaap, 2009
The scipy library uses the Welch's method to estimate a PSD. This method is more complex than just taking the squared modulus of the discrete Fourier transform. In short terms, it proceeds as follows:
Let x be the input discrete signal that contains N samples.
Split x into M overlapping segments, such that each segment sm contains nperseg samples and that each two consecutive segments overlap in noverlap samples, so that nperseg = K * (nperseg - noverlap), where K is an integer (usually K = 2). Note also that:
N = nperseg + (M - 1) * (nperseg - noverlap) = (M + K - 1) * nperseg / K
From each segment sm, subtract its mean (this removes the DC component):
tm = sm - sum(sm) / nperseg
Multiply the elements of the obtained zero-mean segments tm by the elements of a suitable (nonsymmetric) window function, h (such as the Hann window):
um = tm * h
Calculate the Fast Fourier Transform of all vectors um. Before performing these transformations, we usually first append enough zeros to each vector um that its new dimension becomes a power of 2 (the nfft argument of the function welch is used for this purpose). Let us suppose that len(um) = 2 ** p. In most cases, our input vectors are real-valued, so it is best to apply the FFT for real data. Its results are then complex-valued vectors vm = rfft(um), such that len(vm) = 2 ** (p - 1) + 1.
Calculate the squared modulus of all transformed vectors:
am = abs(vm) ** 2,
or more efficiently:
am = vm.real ** 2 + vm.imag ** 2
Normalize the vectors am as follows:
bm = am / sum(h * h)
bm[1:-1] *= 2 (this takes into account the negative frequencies),
where h is a real vector of the dimension nperseg that contains the window coefficients. In case of the Hann window, we can prove that
sum(h * h) = 3 / 8 * len(h) = 3 / 8 * nperseg
Estimate the PSD as the mean of all vectors bm:
psd = sum(bm) / M
The result is a vector of the dimension len(psd) = 2 ** (p - 1) + 1. If we wish that the sum of all psd coefficients matches the mean squared amplitude of the windowed input data (rather than the sum of squared amplitudes), then the vector psd must also be divided by nperseg. However, the scipy routine omits this step. In any case, we usually present psd on the decibel scale, so that the final result is:
psd_dB = 10 * log10(psd).
For a more detailed description, please read the original Welch's paper. See also Wikipedia's page and chapter 13.4 of Numerical Recipes in C
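The steps above can be sketched in a few lines of NumPy. This is only a minimal sketch under stated assumptions, not scipy's actual implementation: welch_sketch is a made-up name, nperseg is assumed even, no zero-padding is applied (nfft = nperseg), and the extra division by fs is my assumption for reproducing scipy's default scaling='density'.

```python
import numpy as np
from scipy import signal

def welch_sketch(x, fs=1.0, nperseg=256, noverlap=128):
    """Hypothetical helper that follows the listed steps (even nperseg only)."""
    h = signal.get_window("hann", nperseg)      # periodic (nonsymmetric) Hann
    step = nperseg - noverlap
    starts = range(0, len(x) - nperseg + 1, step)
    acc = np.zeros(nperseg // 2 + 1)
    for s in starts:
        seg = x[s:s + nperseg]
        seg = seg - seg.mean()                  # remove the DC component
        v = np.fft.rfft(seg * h)                # FFT of the windowed segment
        a = v.real**2 + v.imag**2               # squared modulus
        b = a / (fs * np.sum(h * h))            # window (and fs) normalization
        b[1:-1] *= 2                            # account for negative frequencies
        acc += b
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, acc / len(starts)             # mean over all segments

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
f_sketch, psd_sketch = welch_sketch(x, fs=100.0)
f_ref, psd_ref = signal.welch(x, fs=100.0, window="hann",
                              nperseg=256, noverlap=128)
print(np.allclose(psd_sketch, psd_ref))
```

Comparing against signal.welch with the same window, segment length and overlap is a quick way to check which normalization a given scipy version applies.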
I tried to create a spectrogram of magnitudes using scipy.signal.spectrogram.
Unfortunately I didn't get it working.
My test signal should be a sine with frequency 400 Hz and an amplitude of 1. The result for the magnitude of the spectrogram seems to be 0.5 instead of 1.0. I have no idea what the problem could be.
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
# 2s time range with 44kHz
t = np.arange(0, 2, 1/44000)
# test signal: sine with 400Hz amplitude 1
x = np.sin(t*2*np.pi*440)
# spectrogram for spectrum of magnitudes
f, t, Sxx = signal.spectrogram(x,
                               44000,
                               "hann",  # "hanning" is an older alias that newer SciPy versions may reject
                               nperseg=1000,
                               noverlap=0,
                               scaling="spectrum",
                               return_onesided=True,
                               mode="magnitude"
                               )
# plot last frequency plot
plt.plot(f, Sxx[:,-1])
print("highest magnitude is: %f" %np.max(Sxx))
A strictly real time-domain signal is conjugate symmetric in the frequency domain, i.e. each component appears in both the positive and the negative (or upper) half of a complex FFT result.
Thus you need to add the two "halves" of an FFT result together to get the total energy (Parseval's theorem). Or just double one side, since complex conjugates have equal magnitudes.
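A small sketch of that split, using a plain FFT rather than spectrogram: a unit-amplitude sine with an integer number of cycles in the frame (an assumption made here to avoid leakage) shows up as 0.5 in each half.

```python
import numpy as np

n = 1000                     # frame length
k = 10                       # integer number of cycles in the frame
x = np.sin(2 * np.pi * k * np.arange(n) / n)

spectrum = np.fft.fft(x) / n             # normalize by the frame length
positive_half = np.abs(spectrum[k])      # bin at +k cycles per frame
negative_half = np.abs(spectrum[-k])     # mirrored bin at -k cycles per frame
amplitude = positive_half + negative_half

print(positive_half, negative_half, amplitude)
```

Each half carries half of the amplitude, so adding them (or doubling one side) recovers the amplitude of 1.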
So, I am trying to figure out how to use DFT in practice to detect prevalent frequencies in a signal. I have been trying to wrap my head around what Fourier transforms are and how DFT algorithms work, but apparently I still have ways to go. I have written some code to generate a signal (since the intent is to work with music, I generated a major C chord, hence the weird frequency values) and then tried to work back to the frequency numbers. Here is the code I have
sr = 44100 # sample rate
x = np.linspace(0, 1, sr) # one second of signal
tpi = 2 * np.pi
data = np.sin(261.63 * tpi * x) + np.sin(329.63 * tpi * x) + np.sin(392.00 * tpi * x)
freqs = np.fft.fftfreq(sr)
fft = np.fft.fft(data)
idx = np.argsort(np.abs(fft))
fft = fft[idx]
freqs = freqs[idx]
print(freqs[-6:] * sr)
This gives me [-262. 262. -330. 330. -392. 392.]
which is different from the frequencies I encoded (261.63, 329.63 and 392.0). What am I doing wrong and how do I fix it?
Indeed, if the frame lasts T seconds, the frequencies of the DFT are k/T Hz, where k is an integer. As a consequence, oversampling does not improve the accuracy of the estimated frequency, as long as these frequencies are identified as maxima of the magnitude of the DFT. In contrast, considering longer frames lasting 100 s would induce a spacing between the DFT frequencies of 0.01 Hz, which might be good enough to produce the expected frequency. It is possible to do much better, by estimating the frequency of a peak as its mean frequency with respect to power density.
Figure 1: even after applying a Tukey window, the DFT of the windowed signal is not a sum of Dirac peaks: there is still some spectral leakage at the bottom of the peaks. This power must be accounted for as the frequencies are estimated.
Another issue is that the length of the frame is not a multiple of the period of the signal, which may not be periodic anyway. Nevertheless, the DFT is computed as if the signal were periodic but discontinuous at the edge of the frame. This induces spurious frequencies, described as spectral leakage. Windowing is the reference method to deal with such problems and to mitigate the effect of the artificial discontinuity. Indeed, the value of a window continuously decreases to zero near the edges of the frame. There is a list of window functions, and a lot of window functions are available in scipy.signal. A window is applied as:
tuckey_window=signal.windows.tukey(len(data),0.5,True)
data=data*tuckey_window
At that point, the frequencies exhibiting the largest magnitude are still 262, 330 and 392. Applying a window only makes the peaks more visible: the DFT of the windowed signal features three distinct peaks, each featuring a central lobe and side lobes, depending on the DFT of the window. The lobes of these windows are symmetric: the central frequency can therefore be computed as the mean frequency of the peak, with respect to power density.
import numpy as np
from scipy import signal
import scipy
sr = 44100 # sample rate
x = np.linspace(0, 1, sr) # one second of signal
tpi = 2 * np.pi
data = np.sin(261.63 * tpi * x) + np.sin(329.63 * tpi * x) + np.sin(392.00 * tpi * x)
# a window...
tuckey_window = signal.windows.tukey(len(data), 0.5, True)
data = data * tuckey_window
data -= np.mean(data)
fft = np.fft.rfft(data, norm="ortho")

def abs2(x):
    # squared modulus without taking a square root
    return x.real**2 + x.imag**2

fftmag = abs2(fft)[:1000]
peaks, _ = signal.find_peaks(fftmag, height=np.max(fftmag)*0.1)
print("potential frequencies ", peaks)
# compute the mean frequency of the peak with respect to power density
powerpeak = np.zeros(len(peaks))
powerpeaktimefrequency = np.zeros(len(peaks))
for i in range(1000):
    # find the peak nearest to bin i
    dist = 1000
    jnear = 0
    for j in range(len(peaks)):
        if dist > np.abs(i - peaks[j]):
            dist = np.abs(i - peaks[j])
            jnear = j
    # attribute the power of bin i to that peak
    powerpeak[jnear] += fftmag[i]
    powerpeaktimefrequency[jnear] += fftmag[i] * i
powerpeaktimefrequency = np.divide(powerpeaktimefrequency, powerpeak)
print('corrected frequencies', powerpeaktimefrequency)
The resulting estimated frequencies are 261.6359 Hz, 329.637 Hz and 392.0088 Hz: this is much better than 262, 330 and 392 Hz, and it satisfies the required 0.01 Hz accuracy for such a pure noiseless input signal.
DFT result bins are separated by Fs/N in frequency, where N is the length of the FFT. Thus, the duration of your DFT window limits the resolution in terms of DFT result bin frequency center spacings.
But, for well separated frequency peaks in low noise (high S/N), instead of increasing the duration of the data, you can instead estimate the frequency peak locations to a higher resolution by interpolating the DFT result between the DFT result bins. You can try parabolic interpolation for a coarse frequency peak location estimate, but windowed Sinc interpolation (essentially Shannon-Whittaker reconstruction) would provide far better frequency estimation accuracy and resolution (given a low enough noise floor around the frequency peak(s) of interest, e.g. no nearby sinusoids in your artificial waveform case).
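As a rough illustration of the parabolic variant only (not the windowed Sinc interpolation), here is a minimal sketch. It assumes a single noiseless sinusoid, a Hann window, and a parabola fitted to the log magnitude of the three bins around the maximum; parabolic_peak_hz is a made-up helper name.

```python
import numpy as np

def parabolic_peak_hz(x, fs):
    """Hypothetical helper: refine the strongest DFT peak with a parabola."""
    n = len(x)
    spectrum = np.abs(np.fft.rfft(x * np.hanning(n)))
    # strongest bin, excluding the first and last so both neighbours exist
    k = int(np.argmax(spectrum[1:-1])) + 1
    a, b, c = np.log(spectrum[k - 1:k + 2])
    # vertex of the parabola through the three log-magnitude points
    offset = 0.5 * (a - c) / (a - 2 * b + c)
    return (k + offset) * fs / n

sr = 44100
t = np.arange(sr) / sr                   # exactly one second of signal
x = np.sin(2 * np.pi * 261.63 * t)
f_est = parabolic_peak_hz(x, sr)
print(f_est)
```

With one second of data the bins are 1 Hz apart, so the plain maximum lands on 262 Hz; the fitted vertex should land noticeably closer to 261.63 Hz.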
Since you want to get a resolution of 0.01 Hz, you will need to sample at least 100 sec worth of data. You will be able to resolve frequencies up to about 22.05 kHz.
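The relation between frame duration and bin spacing is easy to check numerically. A minimal sketch, where bin_spacing_hz is a made-up helper name:

```python
import numpy as np

def bin_spacing_hz(fs, duration_s):
    """Hypothetical helper: spacing between DFT bins for a given frame."""
    n = int(round(fs * duration_s))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs[1] - freqs[0]

print(bin_spacing_hz(44100, 1))      # one second of data
print(bin_spacing_hz(88200, 1))      # doubling the sample rate
print(bin_spacing_hz(44100, 100))    # one hundred seconds of data
```

The spacing depends only on the duration (1/T), which is why a higher sample rate does not help here while a longer frame does.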
I am trying to create an amplitude vs frequency spectrogram of an audio file in Python. What is the procedure to do so?
Some sample code would be of great help.
Simple spectrum
The simplest way to get an amplitude vs. frequency relationship for an evenly sampled signal x is to compute its Discrete Fourier Transform through the efficient Fast Fourier Transform algorithm. Given a signal x sampled at a regular sampling rate fs, you could do this with:
import numpy as np
Xf_mag = np.abs(np.fft.fft(x))
Each index of the Xf_mag array will then contain the amplitude of a frequency bin whose frequency is given by index * fs/len(Xf_mag). These frequencies can be conveniently computed using:
freqs = np.fft.fftfreq(len(Xf_mag), d=1.0/fs)
Finally the spectrum could be plotted using matplotlib:
import matplotlib.pyplot as plt
plt.plot(freqs, Xf_mag)
Refining the spectrum estimation
You might notice that the spectrum obtained with the simple FFT approach appears very noisy (i.e. with lots of spikes).
To get a more accurate estimate, a more sophisticated approach would be to compute a power spectrum estimate using techniques such as periodograms (implemented by scipy.signal.periodogram) and Welch's method (implemented by scipy.signal.welch). Note however that in these cases the computed spectrum is proportional to the square of the amplitudes, so that its square root provides an estimate of the Root-Mean-Squared (RMS) amplitudes.
Going back to the signal x sampled at a regular sampling rate fs, such a power spectrum estimate could thus be obtained as described in the samples from scipy's documentation with the following:
f, Pxx = signal.periodogram(x, fs)
A_rms = np.sqrt(Pxx)
The corresponding frequencies f are also calculated in the process, so you could then plot the result with
plt.plot(f, A_rms)
Using scipy.signal.welch is quite similar, but uses a slightly different implementation which provides a different accuracy/resolution tradeoff.
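One detail worth checking is the scaling argument. With the default scaling='density' the square root has units of V/sqrt(Hz); as far as I can tell, scaling='spectrum' is the setting whose square root reads directly as an RMS amplitude. A minimal sketch, assuming a sine that fits an integer number of cycles into the frame:

```python
import numpy as np
from scipy import signal

fs = 1000.0
n = 1000
amp = 2.0
t = np.arange(n) / fs
x = amp * np.sin(2 * np.pi * 50.0 * t)   # 50 full cycles in the frame

# 'spectrum' scaling returns power per bin (V**2) instead of V**2/Hz
f, Pxx = signal.periodogram(x, fs, scaling="spectrum")
A_rms = np.sqrt(Pxx)

peak_rms = A_rms.max()
peak_freq = f[np.argmax(A_rms)]
print(peak_freq, peak_rms, amp / np.sqrt(2))
```

For a sine of amplitude amp the expected RMS value is amp / sqrt(2), which is what the peak bin should show.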
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
fs = 10e3
N = 1e5
amp = 2 * np.sqrt(2)
noise_power = 0.01 * fs / 2
time = np.arange(N) / float(fs)
mod = 500*np.cos(2*np.pi*0.25*time)
carrier = amp * np.sin(2*np.pi*3e3*time + mod)
noise = np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
noise *= np.exp(-time/5)
x = carrier + noise
f, t, Sxx = signal.spectrogram(x, fs)
plt.pcolormesh(t, f, Sxx)
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
This is pulled from the SciPy documentation, as you will need a scientific computing library to create a spectrogram.
Install SciPy on your machine if you do not have it already and read its documentation:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.spectrogram.html
I have an arbitrary signal and I need to know the frequency spectrum of the signal, which I obtain by doing an FFT. The issue is, I need lots of resolution only around this one particular frequency. The issue is, if I increase my window width, or if I up the sample rate, it goes too slow and I end up with a lot of detail everywhere. I only want a lot of detail in one point, and minimal detail everywhere else.
I tried using a Goertzel filter around just the area I need, and then FFT everywhere else, but that didn't get me any more resolution, which I suppose was to be expected.
Any ideas? My only idea at the moment is to sweep and innerproduct around the value I want.
Thanks.
Increasing the sample rate will not give you a higher spectral resolution, it will only give you more high-frequency information, which you are not interested in. The only way to increase spectral resolution is to increase the window length. There is a way to increase the length of your window artificially by zero-padding, but this only gives you 'fake resolution', it will just yield a smooth curve between the normal points. So the only way is to measure data over a longer period, there is no free lunch.
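The 'fake resolution' of zero-padding can be seen directly. In this minimal sketch (signal and sizes chosen arbitrarily), padding to four times the length produces a spectrum whose every fourth sample equals the unpadded spectrum, so the extra points only interpolate between the original bins.

```python
import numpy as np

fs = 100.0
n = 256
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 12.3 * t)

plain = np.fft.rfft(x)               # n-point transform
padded = np.fft.rfft(x, n=4 * n)     # same data, zero-padded to 4n points

# every 4th bin of the padded transform coincides with an original bin
same_samples = np.allclose(padded[::4], plain)
print(len(plain), len(padded), same_samples)
```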
For the problem you described, the standard way to reduce computation time of the FFT is to use demodulation (or heterodyning, not sure what the official name is). Multiply your data with a sine with a frequency close to your frequency of interest (could be the exact frequency, but that is not necessary), and then decimate your data (low-pass filtering with corner frequency just below the Nyquist frequency of your down-sampled sample rate, followed by down-sampling). In this way, you have far fewer points, so your FFT will be faster. The resulting spectrum will be similar to your original spectrum, but simply shifted by the demodulation frequency. So when making a plot, simply add f_demod to your x-axis.
One thing to be careful about is that if you multiply with a real sine, your down-sampled spectrum will actually be the sum of two mirrored spectra, since a real sine consists of positive and negative frequencies. There are two solutions to this:
demodulate by both a sine and a cosine of the same frequency, so that you obtain 2 spectra, after which taking the sum or difference will get you your spectrum.
demodulate by multiplying with a complex sine of the form exp(2*pi*i*f_demod*t). The input for your FFT will now be complex, so you will have to calculate a two-sided spectrum. But this is exactly what you want, you will get both the frequencies below and above f_demod.
I prefer the second solution. Quick example:
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.mlab import psd
from scipy.signal import decimate
f_line = 123.456
f_demod = 122
f_sample = 1000
t_total = 100
t_win = 10
ratio = 10
t = np.arange(0, t_total, 1 / f_sample)
x = np.sin(2*np.pi*f_line * t) + np.random.randn(len(t)) # sine plus white noise
lo = 2**.5 * np.exp(-2j*np.pi*f_demod * t) # local oscillator
y = decimate(x * lo, ratio) # demodulate and decimate to 100 Hz
z = decimate(y, ratio) # decimate further to 10 Hz
nfft = int(round(f_sample * t_win))
X, fx = psd(x, NFFT = nfft, noverlap = nfft // 2, Fs = f_sample)  # integer overlap
nfft = int(round(f_sample * t_win / ratio))
Y, fy = psd(y, NFFT = nfft, noverlap = nfft // 2, Fs = f_sample / ratio)
nfft = int(round(f_sample * t_win / ratio**2))
Z, fz = psd(z, NFFT = nfft, noverlap = nfft // 2, Fs = f_sample / ratio**2)
plt.semilogy(fx, X, fy + f_demod, Y, fz + f_demod, Z)
plt.xlabel('Frequency (Hz)')
plt.ylabel('PSD (V^2/Hz)')
plt.legend(('Full bandwidth FFT', '100 Hz FFT', '10 Hz FFT'))
plt.show()
Result:
If you zoom in, you will note that the results are virtually identical within the pass-band of the decimation filter. One thing to be careful of is that the low-pass filters used in decimate will become numerically unstable if you use decimation ratios much larger than 10. The solution to this is to decimate in several passes for large ratios, i.e. to decimate by a factor of 1000, you decimate 3 times by a factor of 10.
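A minimal sketch of the multi-pass idea, assuming a noiseless version of the test line above. decimate_in_stages is a made-up helper name, and the peak is located with a plain FFT instead of psd:

```python
import numpy as np
from scipy.signal import decimate

def decimate_in_stages(x, factors):
    """Hypothetical helper: apply decimate once per factor, in order."""
    for q in factors:
        x = decimate(x, q)
    return x

f_line, f_demod, f_sample = 123.456, 122.0, 1000.0
t = np.arange(100000) / f_sample             # 100 s of data
x = np.sin(2 * np.pi * f_line * t)           # noiseless test line
lo = np.exp(-2j * np.pi * f_demod * t)       # complex local oscillator

z = decimate_in_stages(x * lo, (10, 10))     # 1000 Hz -> 100 Hz -> 10 Hz
f_low = f_sample / 100

# two-sided spectrum of the complex baseband signal
freqs = np.fft.fftfreq(len(z), d=1.0 / f_low)
f_est = freqs[np.argmax(np.abs(np.fft.fft(z)))] + f_demod
print(len(z), f_est)
```

The FFT now runs on 1000 points instead of 100000, while the bin spacing stays at 0.01 Hz because the duration of the data is unchanged.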