So, I am trying to figure out how to use DFT in practice to detect prevalent frequencies in a signal. I have been trying to wrap my head around what Fourier transforms are and how DFT algorithms work, but apparently I still have a ways to go. I have written some code to generate a signal (since the intent is to work with music, I generated a C major chord, hence the weird frequency values) and then tried to work back to the frequency numbers. Here is the code I have:
sr = 44100 # sample rate
x = np.linspace(0, 1, sr) # one second of signal
tpi = 2 * np.pi
data = np.sin(261.63 * tpi * x) + np.sin(329.63 * tpi * x) + np.sin(392.00 * tpi * x)
freqs = np.fft.fftfreq(sr)
fft = np.fft.fft(data)
idx = np.argsort(np.abs(fft))
fft = fft[idx]
freqs = freqs[idx]
print(freqs[-6:] * sr)
This gives me [-262. 262. -330. 330. -392. 392.]
which is different from the frequencies I encoded (261.63, 329.63 and 392.0). What am I doing wrong and how do I fix it?
Indeed, if the frame lasts T seconds, the frequencies of the DFT are k/T Hz, where k is an integer. As a consequence, oversampling does not improve the accuracy of the estimated frequencies, as long as these frequencies are identified as maxima of the magnitude of the DFT. In contrast, a longer frame lasting 100 s would give a spacing of 0.01 Hz between the DFT frequencies, which might be good enough to produce the expected frequencies. It is possible to do much better by estimating the frequency of a peak as its mean frequency with respect to power density.
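As a rough illustration of the k/T spacing (a sketch only; the helper name bin_spacing is mine, not something from the question), the spacing can be read directly from np.fft.rfftfreq. It depends on the frame duration, not on the sample rate:

```python
import numpy as np

def bin_spacing(sample_rate, duration):
    """Spacing in Hz between adjacent DFT frequencies for a frame of `duration` seconds."""
    n = int(round(sample_rate * duration))          # number of samples in the frame
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    return freqs[1] - freqs[0]

print(bin_spacing(44100, 1.0))    # 1 s frame
print(bin_spacing(88200, 1.0))    # oversampled, same duration: same spacing
print(bin_spacing(44100, 100.0))  # 100 s frame: spacing 100 times finer
```

Doubling the sample rate only extends the frequency axis upwards; the grid on which the maxima can fall stays the same.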
Figure 1: even after applying a Tukey window, the DFT of the windowed signal is not a sum of Dirac deltas: there is still some spectral leakage at the bottom of the peaks. This power must be accounted for when the frequencies are estimated.
Another issue is that the length of the frame is not a multiple of the period of the signal, which may not be periodic anyway. Nevertheless, the DFT is computed as if the signal were periodic but discontinuous at the edge of the frame. This induces spurious frequencies, described as spectral leakage. Windowing is the reference method to mitigate the problems related to this artificial discontinuity: the value of a window decreases continuously to zero near the edges of the frame. There is a list of window functions, and many of them are available in scipy.signal. A window is applied as:
# in recent SciPy releases the window functions live in scipy.signal.windows
tukey_window = signal.windows.tukey(len(data), 0.5, True)
data = data * tukey_window
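The effect of a window on leakage can be checked numerically. Here is a minimal sketch, using NumPy's Hann window rather than the Tukey window and an assumed 100.5 Hz test tone that falls halfway between two DFT bins (the sample rate, the tone and the helper relative_leakage are all made up for the illustration):

```python
import numpy as np

fs = 1000                              # assumed sample rate in Hz
n = 1000                               # one-second frame, so bins are 1 Hz apart
t = np.arange(n) / fs
tone = np.sin(2 * np.pi * 100.5 * t)   # 100.5 Hz falls halfway between two bins

def relative_leakage(frame, bin_index):
    """Magnitude at `bin_index` relative to the largest magnitude in the spectrum."""
    mag = np.abs(np.fft.rfft(frame))
    return mag[bin_index] / mag.max()

leak_rect = relative_leakage(tone, 200)                  # no window (rectangular)
leak_hann = relative_leakage(tone * np.hanning(n), 200)  # Hann window
print(leak_rect, leak_hann)
```

Far from the peak (here at 200 Hz), the windowed frame should leak much less than the raw one, at the price of a wider central lobe.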
At that point, the frequencies exhibiting the largest magnitude are still 262, 330 and 392. Applying a window only makes the peaks more visible: the DFT of the windowed signal features three distinct peaks, each featuring a central lobe and side lobes, depending on the DFT of the window. The lobes of these windows are symmetric: the central frequency can therefore be computed as the mean frequency of the peak, with respect to power density.
import numpy as np
from scipy import signal
sr = 44100 # sample rate
# note: np.linspace includes the endpoint, so the sample spacing is 1/(sr-1), not 1/sr
x = np.linspace(0, 1, sr) # one second of signal
tpi = 2 * np.pi
data = np.sin(261.63 * tpi * x) + np.sin(329.63 * tpi * x) + np.sin(392.00 * tpi * x)
# apply a Tukey window (in recent SciPy the windows live in scipy.signal.windows)
tukey_window = signal.windows.tukey(len(data), 0.5, True)
data = data * tukey_window
data -= np.mean(data)
fft = np.fft.rfft(data, norm="ortho")
def abs2(x):
    # squared magnitude, i.e. power
    return x.real**2 + x.imag**2
fftmag = abs2(fft)[:1000]
peaks, _ = signal.find_peaks(fftmag, height=np.max(fftmag)*0.1)
print("potential frequencies ", peaks)
# compute the mean frequency of each peak with respect to power density
powerpeak = np.zeros(len(peaks))
powerpeaktimefrequency = np.zeros(len(peaks))
for i in range(1000):
    # find the peak nearest to bin i
    dist = 1000
    jnear = 0
    for j in range(len(peaks)):
        if dist > np.abs(i - peaks[j]):
            dist = np.abs(i - peaks[j])
            jnear = j
    # accumulate the power of bin i, and power times frequency, on that peak
    powerpeak[jnear] += fftmag[i]
    powerpeaktimefrequency[jnear] += fftmag[i]*i
powerpeaktimefrequency = np.divide(powerpeaktimefrequency, powerpeak)
print('corrected frequencies', powerpeaktimefrequency)
The resulting estimated frequencies are 261.6359 Hz, 329.637 Hz and 392.0088 Hz: this is much better than 262, 330 and 392 Hz, and it satisfies the required 0.01 Hz accuracy for such a pure noiseless input signal.
DFT result bins are separated by Fs/N in frequency, where N is the length of the FFT. Thus, the duration of your DFT window limits the resolution in terms of DFT result bin frequency center spacings.
But, for well separated frequency peaks in low noise (high S/N), instead of increasing the duration of the data, you can instead estimate the frequency peak locations to a higher resolution by interpolating the DFT result between the DFT result bins. You can try parabolic interpolation for a coarse frequency peak location estimate, but windowed Sinc interpolation (essentially Shannon-Whittaker reconstruction) would provide far better frequency estimation accuracy and resolution (given a low enough noise floor around the frequency peak(s) of interest, e.g. no nearby sinusoids in your artificial waveform case).
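A minimal sketch of the parabolic interpolation variant, under my own assumptions: a Hann window, the fit done on the log magnitude, a simple 10% threshold to pick the peaks, and the chord from the question regenerated with a sample spacing of exactly 1/sr. None of these choices come from the answer itself:

```python
import numpy as np

sr = 44100
n = sr                                   # one second of signal, bins 1 Hz apart
t = np.arange(n) / sr                    # sample spacing is exactly 1/sr
true_freqs = [261.63, 329.63, 392.00]
data = sum(np.sin(2 * np.pi * f * t) for f in true_freqs)

# Hann window, then log-magnitude spectrum (the small offset avoids log(0))
logmag = np.log(np.abs(np.fft.rfft(data * np.hanning(n))) + 1e-12)

# local maxima that rise above 10% of the strongest peak
threshold = logmag.max() + np.log(0.1)
inner = logmag[1:-1]
is_peak = (inner > logmag[:-2]) & (inner >= logmag[2:]) & (inner > threshold)
peak_bins = np.flatnonzero(is_peak) + 1

# fit a parabola through each peak bin and its two neighbours
a, b, c = logmag[peak_bins - 1], logmag[peak_bins], logmag[peak_bins + 1]
offset = 0.5 * (a - c) / (a - 2 * b + c)   # vertex position relative to the peak bin
estimates = (peak_bins + offset) * sr / n  # convert fractional bins to Hz
print(estimates)
```

For this noiseless input the estimates should land within a few hundredths of a hertz of the encoded values, which is coarse compared with what windowed sinc interpolation could reach, as the answer points out.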
Since you want to get a resolution of 0.01 Hz, you will need to sample at least 100 sec worth of data. You will be able to resolve frequencies up to about 22.05 kHz.
I'm trying to do some tests before I proceed analyzing some real dataset via FFT, and I've found the following problem.
First, I create a signal as the sum of two cosines and then use rfft to do the transformation (since it has only real values):
import numpy as np
import matplotlib.pyplot as plt
from scipy.fft import rfft, rfftfreq
# Number of sample points
N = 800
# Sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = 0.5*np.cos(10*2*np.pi*x) + 0.5*np.cos(200*2*np.pi*x)
# FFT
yf = rfft(y)
xf = rfftfreq(N, T)
fig, ax = plt.subplots(1,2,figsize=(15,5))
ax[0].plot(x,y)
ax[1].plot(xf, 2.0/N*np.abs(yf))
As it can be seen from the definition of the signal, I have two oscillations with amplitude 0.5 and frequency 10 and 200. Now, I would expect the FFT spectrum to be something like two deltas at those points, but apparently increasing the frequency broadens the peaks:
From the first peak it can be inferred that the amplitude is 0.5, but not from the second. I've tried to obtain the area under the peak using np.trapz and use that as an estimate for the amplitude, but as it is close to a Dirac delta it's very sensitive to the interval I choose. My problem is that I need to get the amplitude as exactly as possible for my data analysis.
EDIT: As it seems to be something related with the number of points, I decided to increment (now that I can) the sample frequency. This seems to solve the problem, as it can be seen in the figure:
However, it still seems strange that for a certain number of points and sample frequency, the high frequency peaks broaden...
It is not strange: you have leakage between the frequency bins. When you discretize the signal (sampling), as needed for the Fourier transform, frequency bins are created, which are the frequency intervals over which the amplitude is calculated. Each bin has a width given by sample_rate / num_points. So the fewer bins there are, the more difficult it is to assign precise amplitudes to every frequency. Other issues arise when choosing the best sampling rate, such as the Nyquist-Shannon sampling theorem to prevent aliasing: https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem . Depending on the problem, custom sampling rates are sometimes used. E.g. when dealing with audio, a sampling rate of 44,100 Hz is widely used because it is based on the limits of human hearing. So it also depends on the nature of the data you want to analyse, as you wrote. Anyway, since this question also has theoretical value, you can check https://dsp.stackexchange.com for some useful info.
I would comment on George's answer, but I cannot yet.
A good starting point for your research might be the properties of the Discrete Fourier Transform.
The signal in the time domain is actually the cosines multiplied by a box window, which transforms into the frequency domain as the convolution of the deltas with the sinc function. The sinc functions will smear the spectrum.
However, I am not sure we are observing spectral leakage here, since the window fits exactly to the full period of cosines. The discretization of the bins might still play a role here.
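One thing worth checking here is the time axis itself: np.linspace includes its endpoint by default, so with N points over [0, N*T] the actual sample spacing is N*T/(N-1) rather than T, and the frame no longer holds a whole number of periods. A small sketch comparing both choices (the helper peak_amplitudes is mine; the signal is the one from the question):

```python
import numpy as np

N = 800          # number of sample points
T = 1.0 / 800.0  # intended sample spacing

def peak_amplitudes(x):
    """Scaled FFT magnitude at the 10 Hz and 200 Hz bins for the two-cosine signal."""
    y = 0.5 * np.cos(10 * 2 * np.pi * x) + 0.5 * np.cos(200 * 2 * np.pi * x)
    amp = 2.0 / N * np.abs(np.fft.rfft(y))
    return amp[10], amp[200]

# as in the question: the endpoint is included, so the spacing is N*T/(N-1), not T
with_endpoint = peak_amplitudes(np.linspace(0.0, N * T, N))
# endpoint excluded: the spacing is exactly T and the frame holds whole periods
without_endpoint = peak_amplitudes(np.linspace(0.0, N * T, N, endpoint=False))
print(with_endpoint, without_endpoint)
```

With endpoint=False both bins read 0.5. With the endpoint included, the 200 Hz component sits about a quarter of a bin away from bin 200, so its amplitude is smeared over the neighbouring bins, while the 10 Hz component is only about a hundredth of a bin off and barely suffers.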
I have a 3D array of accelerometer signal data sampled at 50 Hz, meaning that the time step is 1/50 = 0.02 s. My goal is to compute the main frequency of this sensor using NumPy or SciPy. My question is: should I compute the frequency of each column separately, use a multidimensional FFT, or compute a single vector and then compute its FFT?
I used the following function to compute the main frequency.
from scipy import fftpack
import numpy as np
def fourier(signal, timestep):
    data = signal - np.mean(signal)
    N = len(data) // 2 # we need half of data
    freq = fftpack.fftfreq(len(data), d=timestep)[:N]
    fft = fftpack.fft(data)[:N]
    amp = np.abs(fft) / N
    order = np.argsort(amp)[::-1] ## sort based on the importance
    return freq[order][0]
A 3D array of accelerometer sensors produces an array of 5 dimensions: the space coordinates, time and the components of the acceleration.
Taking the DFT over the time dimension corresponds to analysing sensors one at a time: each sensor would produce a main frequency, likely slightly different from one sensor to another, as if the sensors were uncoupled.
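As a sketch of this sensor-by-sensor option (the array layout, the number of samples and the test frequencies are assumptions of mine, not taken from the question), the DFT can be taken along the time axis of the whole array at once:

```python
import numpy as np

fs = 50.0                                  # sampling rate from the question
nt = 500                                   # assumed number of time samples
t = np.arange(nt) / fs
# hypothetical data: time along axis 0, one column per sensor channel
data = np.stack([np.sin(2 * np.pi * f * t) for f in (5.0, 7.5, 12.0)], axis=1)

def main_frequencies(data, timestep):
    """Dominant frequency of each column, taking the DFT along the time axis only."""
    data = data - data.mean(axis=0)        # remove the average of each channel
    spectrum = np.abs(np.fft.rfft(data, axis=0))
    freqs = np.fft.rfftfreq(data.shape[0], d=timestep)
    return freqs[np.argmax(spectrum, axis=0)]

print(main_frequencies(data, 1.0 / fs))
```

Each column gets its own answer, which is exactly the "uncoupled sensors" picture described above.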
As an alternative, let's think about taking the DFT over both spatial coordinates and time. It corresponds to writing the compound signal as a sum of sinusoidal plane waves:
where Ǹ is a scaling factor obtained by multiplying the number of points by the number of time samples. In the sequel, I'll drop this global scaling, which is independent of x,y,z,t,k_x,k_y,k_z and w.
At this point, modeling the physics generating this acceleration would be a significant asset. Indeed, using this DFT makes little sense if the phenomenon is dispersive. Nevertheless, diffusion, elasticity or acoustics in a uniform material are non-dispersive: each frequency lives independently of the others. Furthermore, knowing the physics is useful as an energy can be defined. For instance, the kinetic energy associated with the wave k_x,k_y,k_z,w reads:
Therefore, the kinetic energy associated with a given frequency w reads:
As a consequence, this reasoning provides a physically-based way to merge the pointwise DFTs over time. Indeed, according to Parseval's identity:
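The formula that followed this sentence did not survive. As a numerical sketch of the identity being invoked (with NumPy's unnormalized DFT convention and made-up random data on a small hypothetical grid), the power at each frequency is the same whether it is summed over plane waves or over sensor positions:

```python
import numpy as np

rng = np.random.default_rng(0)
nx, ny, nz, nt = 4, 3, 5, 16                  # small hypothetical sensor grid
a = rng.standard_normal((nx, ny, nz, nt))     # one acceleration component a(x, y, z, t)

a_time = np.fft.fft(a, axis=3)                # DFT over time only, sensor by sensor
a_full = np.fft.fftn(a_time, axes=(0, 1, 2))  # then DFT over the three space axes

# power per frequency, summed over sensor positions or over wave vectors
power_pointwise = np.sum(np.abs(a_time) ** 2, axis=(0, 1, 2))
power_planewave = np.sum(np.abs(a_full) ** 2, axis=(0, 1, 2)) / (nx * ny * nz)
print(np.allclose(power_pointwise, power_planewave))
```

So the spatial DFT never has to be computed to get the power spectrum: summing the squared magnitudes of the pointwise time DFTs is enough, up to the global scaling.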
Regarding practical considerations, subtracting the average as you did is indeed a good start. If the velocity is accounted for by multiplying by 1/w^2, the zero frequency (i.e. the average) must be zeroed, to avoid the occurrence of infinite or NaN values.
Moreover, applying a window prior to computing the time DFT could help limit problems related to spectral leakage. The DFT is designed for periodic signals whose periods are consistent with the length of the frame. More specifically, it computes the Fourier transform of a signal built by repeating your frame again and again. As a consequence, artificial discontinuities may appear at the edges, inducing misleading non-existent frequencies. A window drops to near zero close to the edges of the frame, thus reducing the discontinuities and their effect. It could also be suggested to apply a window to the space dimensions as well, to keep the consistency with the physical plane wave decomposition. It would result in giving more weight to the accelerometers at the center of the 3D array.
The plane wave decomposition also requires that the spatial spacing of the sensors be less than about half the expected wavelength. Otherwise, another phenomenon called aliasing occurs. Nevertheless, the power spectrum W(w) might be less sensitive to this issue than the plane wave decomposition. In contrast, if the elastic strain energy is computed starting from the acceleration, aliasing could become a real problem, because computing the strain requires derivatives with respect to space coordinates, i.e. multiplication by k_x, k_y or k_z, and space aliasing corresponds to using the wrong k_x.
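To make the spatial aliasing concrete, here is a one-dimensional sketch with assumed numbers (100 sensors, unit spacing, and a wave of 0.8 cycles per unit length, which is above the limit of 0.5 cycles per unit length that this spacing can represent):

```python
import numpy as np

spacing = 1.0                 # assumed distance between sensors
n = 100                       # assumed number of sensors along one axis
x = np.arange(n) * spacing
true_k = 0.8                  # cycles per unit length, above the Nyquist limit of 0.5
wave = np.sin(2 * np.pi * true_k * x)

k_axis = np.fft.rfftfreq(n, d=spacing)                      # spatial frequencies of the bins
apparent_k = k_axis[np.argmax(np.abs(np.fft.rfft(wave)))]   # where the peak shows up
print(apparent_k)
```

The peak shows up at 1 - 0.8 = 0.2 cycles per unit length instead of 0.8, which is the "wrong k_x" mentioned above.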
Once W(w) is computed, the frequencies corresponding to each peak can be estimated by computing the mean frequency over the peak with respect to power density as in Why are frequency values rounded in signal using FFT? .
Here is some sample code generating plane waves of frequencies not consistent with the size of the frame (both time and space). Hanning windows are applied, the kinetic energy is computed and the frequencies corresponding to each peak are retrieved.
import matplotlib.pyplot as plt
import numpy as np
from scipy import signal
spacingx=1.
spacingy=1.
spacingz=1.
spacingt=1./50.
Nx=5
Ny=5
Nz=5
Nt=512
frequency1=9.5
frequency2=13.7
frequency3=22.3
#building a signal
acc=np.zeros((Nx,Ny,Nz,Nt,3))
for i in range(Nx):
    for j in range(Ny):
        for k in range(Nz):
            for l in range(Nt):
                acc[i,j,k,l,0]=np.sin(i*spacingx+j*spacingy-2*np.pi*frequency1*l*spacingt)
                acc[i,j,k,l,1]=np.sin(i*spacingx+1.5*k*spacingz-2*np.pi*frequency2*l*spacingt)
                acc[i,j,k,l,2]=np.sin(1.5*i*spacingx+k*spacingz-2*np.pi*frequency3*l*spacingt)
#applying a window both in time and space
hanningx=np.hanning(Nx)
hanningy=np.hanning(Ny)
hanningz=np.hanning(Nz)
hanningt=np.hanning(Nt)
for i in range(Nx):
    hx=hanningx[i]
    for j in range(Ny):
        hy=hanningy[j]
        for k in range(Nz):
            hz=hanningz[k]  # was hanningx[k]; same values here only because Nx == Nz
            for l in range(Nt):
                ht=hanningt[l]
                acc[i,j,k,l,0]*=hx*hy*hz*ht
                acc[i,j,k,l,1]*=hx*hy*hz*ht
                acc[i,j,k,l,2]*=hx*hy*hz*ht
#computing the DFT over time.
acctilde=np.fft.fft(acc,axis=3)
#kinetic energy
print(acctilde.shape[3])
kineticW=np.zeros(acctilde.shape[3])
frequencies=np.fft.fftfreq(Nt, spacingt)
for l in range(Nt):
    # skip the zero frequency to avoid dividing by zero
    oneonomegasquared=0.
    if l>0:
        oneonomegasquared=1.0/(frequencies[l]*frequencies[l])
    for i in range(Nx):
        for j in range(Ny):
            for k in range(Nz):
                kineticW[l]+= oneonomegasquared*(np.real(np.vdot(acctilde[i,j,k,l,:],acctilde[i,j,k,l,:])))
plt.plot(frequencies[0:acctilde.shape[3]],kineticW,'k-',label=r'$W(f)$')
#plt.plot(xi,np.real(fourier),'k-', lw=3, color='red', label=r'$f$, Hz')
plt.legend()
plt.show()
# see https://stackoverflow.com/questions/54714169/why-are-frequency-values-rounded-in-signal-using-fft/54775867#54775867
peaks, _= signal.find_peaks(kineticW, height=np.max(kineticW)*0.1)
print("potential frequencies index", peaks)
#compute the mean frequency of the peak with respect to power density
powerpeak=np.zeros(len(peaks))
powerpeaktimefrequency=np.zeros(len(peaks))
for i in range(len(kineticW)):
    # find the peak nearest to bin i
    dist=1000
    jnear=0
    for j in range(len(peaks)):
        if dist>np.abs(i-peaks[j]):
            dist=np.abs(i-peaks[j])
            jnear=j
    powerpeak[jnear]+=kineticW[i]
    powerpeaktimefrequency[jnear]+=kineticW[i]*frequencies[i]
powerpeaktimefrequency=np.divide(powerpeaktimefrequency,powerpeak)
print('corrected frequencies', powerpeaktimefrequency)
I am looking for a way to obtain the frequency from a signal. Here's an example:
signal = [numpy.sin(numpy.pi * x / 2) for x in range(1000)]
This array represents the samples of a recorded sound (x = milliseconds)
sin(pi*x/2) => 250 Hz
How can we go from the signal (list of points) to obtaining the frequencies from this array?
Note:
I have read many Stack Overflow threads and watched many YouTube videos. I have yet to find an answer. Please use simple words.
(I am thankful for every answer)
What you're looking for is known as the Fourier Transform
A bit of background
Let's start with the formal definition:
The Fourier transform (FT) decomposes a function (often a function of time, or a signal) into its constituent frequencies
This is in essence a mathematical operation that when applied over a signal, gives you an idea of how present each frequency is in the time series. In order to get some intuition behind this, it might be helpful to look at the mathematical definition of the DFT:
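The formula itself is missing here. The standard definition of the DFT, which matches the symbols x(n), X(k) and N used in the surrounding text, is:

```latex
X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \dots, N-1
```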
Where k is swept all the way up to N-1 to calculate all the DFT coefficients.
The first thing to notice is that this definition somewhat resembles that of the correlation of two functions, in this case x(n) and the negative exponential function. While this may seem a little abstract, by using Euler's formula and playing around a bit with the definition, the DFT can be expressed as the correlation with both a sine wave and a cosine wave, which account for the imaginary and the real parts of the DFT.
So, keeping in mind that this is in essence computing a correlation, whenever a corresponding sine or cosine from the decomposition of the complex exponential matches that of x(n), there will be a peak in X(k), meaning that such a frequency is present in the signal.
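To see the correlation picture in numbers, here is a small sketch (the test signal and the helper dft_by_correlation are mine, purely for illustration) that computes each DFT coefficient as a correlation with a cosine and a sine:

```python
import numpy as np

n_samples = 64
n = np.arange(n_samples)
x = np.sin(2 * np.pi * 5 * n / n_samples)   # test signal with 5 cycles per frame

def dft_by_correlation(x, k):
    """k-th DFT coefficient as correlations with a cosine and a sine."""
    angle = 2 * np.pi * k * np.arange(len(x)) / len(x)
    real_part = np.sum(x * np.cos(angle))    # correlation with the cosine
    imag_part = -np.sum(x * np.sin(angle))   # correlation with the sine
    return real_part + 1j * imag_part

X = np.array([dft_by_correlation(x, k) for k in range(n_samples)])
print(np.argmax(np.abs(X[: n_samples // 2])))   # bin where the correlation peaks
```

The result agrees with np.fft.fft(x), which computes the same coefficients much faster, and the peak sits at the bin whose sine matches the signal.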
How can we do the same with numpy?
So, having given a very brief theoretical background, let's consider an example to see how this can be implemented in Python. Let's consider the following signal:
import numpy as np
import matplotlib.pyplot as plt
Fs = 150.0 # sampling rate
Ts = 1.0/Fs # sampling interval
t = np.arange(0,1,Ts) # time vector
ff = 50 # frequency of the signal
y = np.sin(2*np.pi*ff*t)
plt.plot(t, y)
plt.xlabel('Time')
plt.ylabel('Amplitude')
plt.show()
Now, the DFT can be computed by using np.fft.fft, which, as mentioned, tells you the contribution of each frequency in the signal, now in the transformed domain:
n = len(y) # length of the signal
k = np.arange(n)
T = n/Fs
frq = k/T # two sides frequency range
frq = frq[:len(frq)//2] # one side frequency range
Y = np.fft.fft(y)/n # dft and normalization
Y = Y[:n//2]
Now, if we plot the actual spectrum, you will see that we get a peak at the frequency of 50 Hz, which in mathematical terms is a delta function centred at the fundamental frequency of 50 Hz. This can be checked in the following Table of Fourier Transform Pairs.
So for the above signal, we would get:
plt.plot(frq,abs(Y)) # plotting the spectrum
plt.xlabel('Freq (Hz)')
plt.ylabel('|Y(freq)|')
plt.show()
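Applied to the signal from the question, the same recipe can be sketched as follows (I assume one sample per millisecond, as the question states, and use the one-sided helpers np.fft.rfft and np.fft.rfftfreq instead of slicing the full FFT):

```python
import numpy as np

# the signal from the question: one sample per millisecond, i.e. 1000 samples per second
samples = np.array([np.sin(np.pi * x / 2) for x in range(1000)])
timestep = 0.001                                   # seconds between samples

spectrum = np.abs(np.fft.rfft(samples))            # magnitude of the one-sided DFT
freqs = np.fft.rfftfreq(len(samples), d=timestep)  # frequency in Hz of each bin
dominant = freqs[np.argmax(spectrum)]              # frequency of the largest peak
print(dominant)
```

The largest peak lands on the 250 Hz bin, which is the frequency the question expects.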
Is there a way to generate a quasi-periodic signal (a signal with a specific frequency distribution, like a normal distribution)? In addition, the signal should not have a stationary frequency distribution, since the inverse Fourier transform of a Gaussian function is still a Gaussian function, while what I want is an oscillating signal.
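I used a discrete series of normally distributed frequencies to generate the signal, that is [equation not preserved]. The frequencies are distributed like this: [figure not preserved]. So with initial phases [equation not preserved], I got the signal [equation not preserved]. However, the signal looks like [figure not preserved] and its FFT spectrum looks like [figure not preserved].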
I found that the final spectrum is only similar to a Gaussian function within a short time period since t=0 (corresponding to the left few peaks in figure4 which are extremely high), and the rest of the signal only contributed to the glitches on both sides of the peak in figure5.
I thought the problem may have come from the initial phases. I tried randomly distributed initial phases but it also didn't work.
So, what is the right way to generate such a signal?
Here is my python code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import erf, erfinv
def gaussian_frequency(array_length = 10000, central_freq = 100, std = 10):
    n = np.arange(array_length)
    f = np.sqrt(2)*std*erfinv(2*n/array_length - erf(central_freq/np.sqrt(2)/std)) + central_freq
    return f
f = gaussian_frequency()
phi = np.linspace(0,2*np.pi, len(f))
t = np.linspace(0,100,100000)
signal = np.zeros(len(t))
for k in range(len(f)):
    signal += np.sin(phi[k] + 2*np.pi*f[k]*t)
def fourierPlt(signal, TIMESTEP = .001):
    num_samples = len(signal)
    k = np.arange(num_samples)
    Fs = 1/TIMESTEP
    T = num_samples/Fs
    frq = k/T # two sides frequency range
    frq = frq[range(int(num_samples/2))] # one side frequency range
    fourier = np.fft.fft(signal)/num_samples # fft computing and normalization
    fourier = abs(fourier[range(int(num_samples/2))])
    fourier = fourier/sum(fourier)
    plt.plot(frq, fourier, 'r', linewidth = 1)
    plt.title("Fast Fourier Transform")
    plt.xlabel('$f$/Hz')
    plt.ylabel('Normalized Spectrum')
    return(frq, fourier)
fourierPlt(signal)
If you want your signal to be real-valued, you need to mirror the frequency component: you need the positive and negative frequencies to be complex conjugates of each other. I presume you thought of this.
A Gaussian-shaped frequency spectrum (with mean at f=0) yields a Gaussian-shaped signal.
Shifting the frequency spectrum by a frequency f0 causes the time-domain signal to be multiplied by exp(j 2 π f0 t). That is, you only change its phase.
Assuming you still want a real-valued time signal, you'll have to duplicate the frequency spectrum and shift it in both directions. This causes a multiplication by
exp(j 2 π f0 t)+exp(-j 2 π f0 t) = 2 cos(2 π f0 t) .
Thus, your signal is a Gaussian modulating a cosine.
I'm using MATLAB here for the example, I hope you can easily translate this to Python:
t=0:300;
s=exp(-(t-150).^2/30.^2) .* cos(2*pi*0.1*t);
subplot(2,1,1)
plot(t,s)
xlabel('time')
S=abs(fftshift(fft(s)));
f=linspace(-0.5,0.5,length(S));
subplot(2,1,2)
plot(f,S)
xlabel('frequency')
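For reference, a possible Python translation of the MATLAB snippet above (a sketch with the plotting calls left out; the names f0 and peak_f are mine, and the frequency axis comes from np.fft.fftfreq rather than from linspace):

```python
import numpy as np

t = np.arange(301)                                  # same as MATLAB's 0:300
f0 = 0.1                                            # carrier, in cycles per sample
s = np.exp(-(t - 150) ** 2 / 30 ** 2) * np.cos(2 * np.pi * f0 * t)

S = np.abs(np.fft.fftshift(np.fft.fft(s)))          # two-sided magnitude spectrum
f = np.fft.fftshift(np.fft.fftfreq(len(s)))         # matching frequency axis
peak_f = abs(f[np.argmax(S)])                       # location of the Gaussian bump
print(peak_f)
```

Plotting s against t and S against f with matplotlib should reproduce the two panels: a cosine under a Gaussian envelope, and two Gaussian bumps centred near +0.1 and -0.1.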
For those interested in image processing: the Gabor filter is exactly this, but with the frequency spectrum shifted in only one direction. The resulting filter is complex, and the magnitude of the filtering result is used. This leads to a phase-independent filter.
I have an arbitrary signal and I need to know the frequency spectrum of the signal, which I obtain by doing an FFT. The issue is, I need lots of resolution only around this one particular frequency. The issue is, if I increase my window width, or if I up the sample rate, it goes too slow and I end up with a lot of detail everywhere. I only want a lot of detail in one point, and minimal detail everywhere else.
I tried using a Goertzel filter around just the area I need, and then FFT everywhere else, but that didn't get me any more resolution, which I suppose was to be expected.
Any ideas? My only idea at the moment is to sweep and innerproduct around the value I want.
Thanks.
Increasing the sample rate will not give you a higher spectral resolution, it will only give you more high-frequency information, which you are not interested in. The only way to increase spectral resolution is to increase the window length. There is a way to increase the length of your window artificially by zero-padding, but this only gives you 'fake resolution', it will just yield a smooth curve between the normal points. So the only way is to measure data over a longer period, there is no free lunch.
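A quick way to convince yourself that zero-padding adds no information (a sketch on made-up random data; the padding factor of 4 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)            # arbitrary measured frame (made-up data)

plain = np.fft.fft(x)                   # 256 bins
padded = np.fft.fft(x, n=4 * len(x))    # zero-padded to 1024 samples, 1024 bins

# every 4th bin of the padded spectrum coincides with a bin of the plain one;
# the bins in between are interpolated from the same 256 samples
print(np.allclose(padded[::4], plain))
```

The padded spectrum passes exactly through the original points, so it is a smoother curve drawn from the same data rather than a sharper measurement.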
For the problem you described, the standard way to reduce the computation time of the FFT is to use demodulation (or heterodyning, not sure what the official name is). Multiply your data with a sine with a frequency close to your frequency of interest (could be the exact frequency, but that is not necessary), and then decimate your data (low-pass filtering with a corner frequency just below the Nyquist frequency of your down-sampled sample rate, followed by down-sampling). In this way, you have far fewer points, so your FFT will be faster. The resulting spectrum will be similar to your original spectrum, but simply shifted by the demodulation frequency. So when making a plot, simply add f_demod to your x-axis.
One thing to be careful about is that if you multiply with a real sine, your down-sampled spectrum will actually be the sum of two mirrored spectra, since a real sine consists of positive and negative frequencies. There are two solutions to this:
1. demodulate by both a sine and a cosine of the same frequency, so that you obtain 2 spectra, after which taking the sum or difference will get you your spectrum.
2. demodulate by multiplying with a complex sine of the form exp(2*pi*i*f_demod*t). The input for your FFT will now be complex, so you will have to calculate a two-sided spectrum. But this is exactly what you want: you will get both the frequencies below and above f_demod.
I prefer the second solution. Quick example:
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.mlab import psd
from scipy.signal import decimate
f_line = 123.456
f_demod = 122
f_sample = 1000
t_total = 100
t_win = 10
ratio = 10
t = np.arange(0, t_total, 1 / f_sample)
x = np.sin(2*np.pi*f_line * t) + np.random.randn(len(t)) # sine plus white noise
lo = 2**.5 * np.exp(-2j*np.pi*f_demod * t) # local oscillator
y = decimate(x * lo, ratio) # demodulate and decimate to 100 Hz
z = decimate(y, ratio) # decimate further to 10 Hz
nfft = int(round(f_sample * t_win))
X, fx = psd(x, NFFT = nfft, noverlap = nfft // 2, Fs = f_sample)  # noverlap must be an integer
nfft = int(round(f_sample * t_win / ratio))
Y, fy = psd(y, NFFT = nfft, noverlap = nfft // 2, Fs = f_sample / ratio)
nfft = int(round(f_sample * t_win / ratio**2))
Z, fz = psd(z, NFFT = nfft, noverlap = nfft // 2, Fs = f_sample / ratio**2)
plt.semilogy(fx, X, fy + f_demod, Y, fz + f_demod, Z)
plt.xlabel('Frequency (Hz)')
plt.ylabel('PSD (V^2/Hz)')
plt.legend(('Full bandwidth FFT', '100 Hz FFT', '10 Hz FFT'))
plt.show()
Result:
If you zoom in, you will note that the results are virtually identical within the pass-band of the decimation filter. One thing to be careful of is that the low-pass filters used in decimate become numerically unstable if you use decimation ratios much larger than 10. The solution is to decimate in several passes for large ratios, i.e. to decimate by a factor of 1000, you decimate 3 times by a factor of 10.
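The multi-pass decimation can be wrapped in a small helper. This is only a sketch: the name decimate_in_stages, the 100 kHz input rate and the 5 Hz test tone are assumptions of mine, and each pass simply calls scipy.signal.decimate with its default filter:

```python
import numpy as np
from scipy.signal import decimate

def decimate_in_stages(x, factors):
    """Apply scipy.signal.decimate once per factor, e.g. (10, 10, 10) for 1000."""
    for q in factors:
        x = decimate(x, q)
    return x

f_sample = 100000                                   # assumed input sample rate in Hz
t = np.arange(f_sample) / f_sample                  # one second of samples
x = np.sin(2 * np.pi * 5 * t)                       # 5 Hz tone, well inside the final band
y = decimate_in_stages(x, (10, 10, 10))             # overall factor 1000, output at 100 Hz
print(len(x), len(y))
```

Each pass keeps the ratio at 10, where the default filter is known to behave, instead of asking a single filter for a ratio of 1000.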