I'm trying to perform an FFT of an EEG signal in Python and then, based on the bandwidth, determine whether it's an alpha or a beta signal. The code looked fine, but the resulting plots are nothing like they should be: the frequencies and magnitude values are not what I expected. Any help appreciated, here's the code:
from scipy.io import loadmat
import scipy
import numpy as np
from pylab import *
import matplotlib.pyplot as plt
eeg = loadmat("eeg_2013.mat");
eeg1=eeg['eeg1'][0]
eeg2=eeg['eeg2'][0]
fs = eeg['fs'][0][0]
fft1 = scipy.fft(eeg1)
f = np.linspace (fs,len(eeg1), len(eeg1), endpoint=False)
plt.figure(1)
plt.subplot(211)
plt.plot (f, abs (fft1))
plt.title ('Magnitude spectrum of the signal')
plt.xlabel ('Frequency (Hz)')
show()
plt.subplot(212)
fft2 = scipy.fft(eeg2)
f = np.linspace (fs,len(eeg2), len(eeg2), endpoint=False)
plt.plot (f, abs (fft2))
plt.title ('Magnitude spectrum of the signal')
plt.xlabel ('Frequency (Hz)')
show()
And the plots:
In order to get an array of the FFT frequencies, you should use fftfreq; it gives you an array of frequencies to use as the abscissa:
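from scipy.io import loadmat
from scipy.fftpack import fft, fftfreq
eeg = loadmat("eeg_2013.mat")
eeg1 = eeg['eeg1'][0]
eeg2 = eeg['eeg2'][0]
fs = eeg['fs'][0][0]
# scipy.fft is a module in recent SciPy versions, so call the fft function itself
fft1 = fft(eeg1)
# frequency of each FFT bin in Hz; 1.0/fs is the sample spacing in seconds
f = fftfreq(eeg1.size, 1.0/fs)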
Sorry, I can't test this code in real conditions because you didn't post a data sample, but I hope this should work.
Concerning how to determine the bandwidth, as far as I understand, you want to get the fundamental frequency. There are different ways, more or less complicated depending on whether your signal is noisy or not. In your case, you only want to know whether the fundamental frequency f0 is in the range 8-13 Hz (alpha) or 13-30 Hz (beta); one very simple way is to compute the maximum of the FFT magnitude in the range 8-13 Hz: abs(fft1[(f>8) & (f<13)]).max() and if it's more than, say, 1000, it's an alpha wave, otherwise it's beta. If your signals are less similar, please post some examples of different kinds of samples and the result you would expect, so that we can try more complicated algorithms.
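As a rough sketch of that idea, the two band peaks can also be compared with each other instead of against a fixed threshold. The helper name `dominant_band`, the 256 Hz sampling rate and the synthetic test signals are assumptions for illustration only, not taken from the question's data:

```python
import numpy as np

def dominant_band(x, fs):
    """Return 'alpha' or 'beta' depending on which band holds the larger spectral peak."""
    mag = np.abs(np.fft.rfft(x))               # magnitudes, positive frequencies only
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)    # matching frequency axis in Hz
    alpha_peak = mag[(f >= 8) & (f < 13)].max()
    beta_peak = mag[(f >= 13) & (f <= 30)].max()
    return "alpha" if alpha_peak > beta_peak else "beta"

# Hypothetical test signals: 4 s sampled at 256 Hz (assumed values)
fs = 256
t = np.arange(0, 4, 1.0 / fs)
rng = np.random.default_rng(0)
alpha_like = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(t.size)
beta_like = np.sin(2 * np.pi * 20 * t) + 0.3 * rng.standard_normal(t.size)

print(dominant_band(alpha_like, fs))
print(dominant_band(beta_like, fs))
```

Comparing the two bands avoids having to pick a magnitude threshold, which depends on the signal length and scaling.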
If your sampling frequency is fs and you have N=len(eeg1) samples, then the fft procedure will, of course, return an array of N values. The first N/2 of them correspond to the frequency range 0..fs/2; the second half corresponds to the mirrored (negative) frequency range -fs/2..0. For real input signals the mirrored half is just the complex conjugate of the positive half, so it can be disregarded in further analysis (but not in the inverse fft).
So essentially, you should build the frequency axis as
f=linspace(0,N-1,N)*fs/N
Edit: or, even simpler, with minimal changes to the initial code
f = np.linspace (0,fs,len(eeg1), endpoint=False)
so that f ranges from 0 to just below fs, and then disregard the second half of the fft result in the output:
plt.plot( f[:N//2], abs( fft1[:N//2] ) )
Added: You can use fftshift to exchange both halves, then the correct frequency range is
f = np.linspace (-fs/2,fs/2,len(eeg1), endpoint=False)
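The two frequency axes can be checked against each other with a small sketch. The 100 Hz sampling rate and the 5 Hz test tone are assumed values chosen so that the tone falls exactly on an FFT bin:

```python
import numpy as np

fs = 100.0                                  # assumed sampling rate in Hz
N = 200
t = np.arange(N) / fs
x = np.sin(2 * np.pi * 5.0 * t)             # assumed 5 Hz test tone

X = np.fft.fft(x)

# Unshifted: bins run from 0 to just below fs; only the first half is needed for real input
f = np.linspace(0, fs, N, endpoint=False)
half = N // 2
peak_unshifted = f[:half][np.argmax(np.abs(X[:half]))]

# Shifted: fftshift swaps the halves so bins run from -fs/2 to just below fs/2
f_shifted = np.linspace(-fs / 2, fs / 2, N, endpoint=False)
X_shifted = np.fft.fftshift(X)
peaks_shifted = f_shifted[np.abs(X_shifted) > 0.5 * np.abs(X_shifted).max()]

print(peak_unshifted)
print(peaks_shifted)
```

After the shift, the tone shows up as a symmetric pair of peaks at negative and positive frequency, which is the mirrored half described above.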
I tried to create a spectrogram of magnitudes using scipy.signal.spectrogram.
Unfortunately I didn't get it working.
My test signal should be a sine with a frequency of 440 Hz and an amplitude of 1. The result for the magnitude of the spectrogram seems to be 0.5 instead of 1.0. I have no idea what the problem could be.
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
# 2 s time range sampled at 44 kHz
t = np.arange(0, 2, 1/44000)
# test signal: sine with 440 Hz, amplitude 1
x = np.sin(t*2*np.pi*440)
# spectrogram for the spectrum of magnitudes
# ("hann" is the current window name; the "hanning" alias is gone from recent SciPy)
f, t, Sxx = signal.spectrogram(x,
                               44000,
                               "hann",
                               nperseg=1000,
                               noverlap=0,
                               scaling="spectrum",
                               return_onesided=True,
                               mode="magnitude"
                               )
# plot last frequency plot
plt.plot(f, Sxx[:,-1])
print("highest magnitude is: %f" %np.max(Sxx))
A strictly real time-domain signal is conjugate symmetric in the frequency domain, i.e. its energy appears in both the positive and the negative (or upper) half of a complex FFT result.
Thus you need to add the two "halves" of an FFT result together to get the total energy (Parseval's theorem). Or just double one side, since complex conjugates have equal magnitudes.
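A minimal sketch of that doubling with a plain FFT rather than scipy.signal.spectrogram (so no window correction is involved), assuming the sampling rate and sine frequency from the question's code:

```python
import numpy as np

fs = 44000                                   # sampling rate from the question
t = np.arange(2 * fs) / fs                   # exactly 2 s of samples
x = np.sin(2 * np.pi * 440 * t)              # amplitude-1 sine

N = len(x)
X = np.fft.rfft(x)                           # positive-frequency half only
one_sided = np.abs(X) / N                    # each half carries amplitude / 2
amplitude = 2 * one_sided                    # doubling restores the full amplitude
# (strictly, the DC bin and the Nyquist bin should not be doubled)

print(one_sided.max(), amplitude.max())
```

The undoubled peak comes out at about half the sine's amplitude, which matches the 0.5 observed in the question.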
I am looking for a way to obtain the frequency from a signal. Here's an example:
signal = [numpy.sin(numpy.pi * x / 2) for x in range(1000)]
This array represents the samples of a recorded sound (x = milliseconds)
sin(pi*x/2) => 250 Hz
How can we go from the signal (a list of points) to obtaining the frequencies from this array?
Note:
I have read many Stack Overflow threads and watched many YouTube videos. I have yet to find an answer. Please use simple words.
(I am thankful for every answer)
What you're looking for is known as the Fourier Transform
A bit of background
Let's start with the formal definition:
The Fourier transform (FT) decomposes a function (often a function of time, or a signal) into its constituent frequencies
This is in essence a mathematical operation that, when applied to a signal, gives you an idea of how strongly each frequency is present in the time series. In order to get some intuition behind this, it might be helpful to look at the mathematical definition of the DFT:
$$X(k) = \sum_{n=0}^{N-1} x(n) \, e^{-j 2 \pi k n / N}$$
where k is swept from 0 all the way up to N-1 to calculate all the DFT coefficients.
The first thing to notice is that, this definition resembles somewhat that of the correlation of two functions, in this case x(n) and the negative exponential function. While this may seem a little bit abstract, by using Euler's formula and by playing a bit around with the definition, the DFT can be expressed as the correlation with both a sine wave and a cosine wave, which will account for the imaginary and the real parts of the DFT.
So keeping in mind that this is in essence computing a correlation, whenever a corresponding sine or cosine from the decomposition of the complex exponential matches that of x(n), there will be a peak in X(k), meaning that such a frequency is present in the signal.
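The correlation view can be written out directly. This is a deliberately naive sketch for intuition, not how np.fft.fft is implemented; the function name `dft_by_correlation` and the 64-sample test signal are assumptions for illustration:

```python
import numpy as np

def dft_by_correlation(x):
    """Naive O(N^2) DFT: correlate x with a cosine and a sine for every k."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    X = np.zeros(N, dtype=complex)
    for k in range(N):
        angle = 2 * np.pi * k * n / N
        real_part = np.sum(x * np.cos(angle))     # correlation with the cosine
        imag_part = -np.sum(x * np.sin(angle))    # correlation with the sine (note the sign)
        X[k] = real_part + 1j * imag_part
    return X

# Assumed example: 5 full cycles of a sine within 64 samples
N = 64
x = np.sin(2 * np.pi * 5 * np.arange(N) / N)
X = dft_by_correlation(x)

print(np.argmax(np.abs(X[:N // 2])))       # index of the strongest positive-frequency bin
print(np.allclose(X, np.fft.fft(x)))       # compare against NumPy's FFT
```

The strongest bin is the one whose sine matches the signal, and the result agrees with the FFT, which computes the same sums far more efficiently.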
How can we do the same with numpy?
So having given a very brief theoretical background, let's consider an example to see how this can be implemented in Python. Let's consider the following signal:
import numpy as np
import matplotlib.pyplot as plt
Fs = 150.0; # sampling rate
Ts = 1.0/Fs; # sampling interval
t = np.arange(0,1,Ts) # time vector
ff = 50; # frequency of the signal
y = np.sin(2*np.pi*ff*t)
plt.plot(t, y)
plt.xlabel('Time')
plt.ylabel('Amplitude')
plt.show()
Now, the DFT can be computed using np.fft.fft, which, as mentioned, tells you the contribution of each frequency to the signal, now in the transformed domain:
n = len(y) # length of the signal
k = np.arange(n)
T = n/Fs
frq = k/T # two sides frequency range
frq = frq[:len(frq)//2] # one side frequency range
Y = np.fft.fft(y)/n # dft and normalization
Y = Y[:n//2]
Now, if we plot the actual spectrum, you will see that we get a peak at a frequency of 50 Hz, which in mathematical terms is a delta function centred on the fundamental frequency of 50 Hz. This can be checked in the following Table of Fourier Transform Pairs.
So for the above signal, we would get:
plt.plot(frq,abs(Y)) # plotting the spectrum
plt.xlabel('Freq (Hz)')
plt.ylabel('|Y(freq)|')
plt.show()
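Applied to the signal from the question, the same steps reduce to a few lines. This sketch assumes one sample per millisecond (a sampling rate of 1000 Hz), as the question states, and uses np.fft.rfft and np.fft.rfftfreq so that only the positive-frequency half is computed:

```python
import numpy as np

fs = 1000.0                                            # one sample per millisecond
signal = np.array([np.sin(np.pi * x / 2) for x in range(1000)])

spectrum = np.abs(np.fft.rfft(signal))                 # magnitudes for 0 .. fs/2
freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)       # bin frequencies in Hz
peak_freq = freqs[np.argmax(spectrum)]

print(peak_freq)
```

The frequency of the largest peak is the dominant frequency of the recording; for signals with several components, the other peaks in `spectrum` give the remaining frequencies.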
The analytical Fourier transform of a sinusoidal signal is purely imaginary. However, when numerically computing the discrete Fourier transform, the result is not.
Tldr: Find all answers to this question here.
Consider therefore the following code
import matplotlib.pyplot as plt
import numpy as np
from scipy.fftpack import fft, fftfreq
f_s = 200 # Sampling rate = number of measurements per second in [Hz]
t = np.arange(0,10000, 1 / f_s)
N = len(t)
A = 4 # Amplitude of sinus signal
x = A * np.sin(t)
X = fft(x)[1:N//2]
freqs = (fftfreq(len(x)) * f_s)[1:N//2]
fig, (ax1,ax2) = plt.subplots(2,1, sharex = True)
# raw strings keep the LaTeX backslashes from being read as escape sequences
ax1.plot(freqs, X.real, label = r"$\Re[X(\omega)]$")
ax1.plot(freqs, X.imag, label = r"$\Im[X(\omega)]$")
ax1.set_title(r"Discrete Fourier Transform of $x(t) = A \cdot \sin(t)$")
ax1.legend()
ax1.grid(True)
ax2.plot(freqs, np.abs(X), label = r"$|X(\omega)|$")
ax2.legend()
ax2.set_xlabel(r"Frequency $\omega$")
ax2.set_yscale("log")
ax2.grid(True, which = "both")
ax2.set_xlim(0.15,0.175)
plt.show()
Clearly, the absolute value |X(w)| can be used as a good approximation to the analytical result. However, the imaginary and real parts of X(w) differ from it. Another question on SO already mentioned this fact, but did not explain why. So can I only use the absolute value and the phase?
Another question would be how the Amplitude is related to the numerical result. Mathematically speaking it should be the integral under the curve of |X(w)| divided by normalization (which, as far as I understood, should be given by N), i.e. approximately by
A_approx = np.sum(np.abs(X)) / N
print(f"Numerical value: {A_approx:.1f}, Correct value: {A:.1f}")
Numerical value: 13.5, Correct value: 4.0
This does not seem to be the case. Any insights? Ideas?
Related questions which did not help are here and here.
An FFT does not produce the result you expect because it is finite in length, and thus more similar to the Fourier Transform of a rectangular window on your sinusoid. The length and placement of this rectangular window will affect the phase and amplitude of the FFT result.
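The effect of the window placement can be seen in a small sketch that compares a sine completing a whole number of periods within the window against one that does not. The 10 s record length and the two test frequencies are assumed values; the amplitude and sampling rate follow the question:

```python
import numpy as np

f_s = 200.0          # sampling rate in Hz
N = 2000             # 10 s of data, so bins are 0.1 Hz apart
A = 4.0              # amplitude, as in the question
t = np.arange(N) / f_s

def peak_of(f0):
    """Return the complex FFT value and the amplitude estimate at the strongest bin."""
    X = np.fft.rfft(A * np.sin(2 * np.pi * f0 * t))
    k = np.argmax(np.abs(X))
    return X[k], 2 * np.abs(X[k]) / N

# Assumed test frequencies: 5.0 Hz sits exactly on a bin, 5.05 Hz falls between two bins
X_on, amp_on = peak_of(5.0)
X_off, amp_off = peak_of(5.05)

print(X_on, amp_on)     # on-bin case
print(X_off, amp_off)   # between-bins case
```

In the on-bin case the peak is essentially purely imaginary and 2|X|/N recovers A. In the between-bins case the energy leaks into neighbouring bins, the real part is no longer negligible, and the single-bin estimate falls short of A.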
I am trying to create an amplitude vs frequency spectrogram of an audio file in Python. What is the procedure to do so?
Some sample code would be of great help.
Simple spectrum
The simplest way to get an amplitude vs. frequency relationship for an evenly sampled signal x is to compute its Discrete Fourier Transform through the efficient Fast Fourier Transform algorithm. Given a signal x sampled at a regular sampling rate fs, you could do this with:
import numpy as np
Xf_mag = np.abs(np.fft.fft(x))
Each index of the Xf_mag array will then contain the amplitude of a frequency bin whose frequency is given by index * fs/len(Xf_mag). These frequencies can be conveniently computed using:
freqs = np.fft.fftfreq(len(Xf_mag), d=1.0/fs)
Finally the spectrum could be plotted using matplotlib:
import matplotlib.pyplot as plt
plt.plot(freqs, Xf_mag)
Refining the spectrum estimation
You might notice that the simple FFT approach yields a spectrum which appears very noisy (i.e. with lots of spikes).
To get a more accurate estimate, a more sophisticated approach would be to compute a power spectrum estimate using techniques such as periodograms (implemented by scipy.signal.periodogram) and Welch's method (implemented by scipy.signal.welch). Note however that in these cases the computed spectrum is proportional to the square of the amplitudes, so that its square root provides an estimate of the Root-Mean-Squared (RMS) amplitudes.
Going back to the signal x sampled at a regular sampling rate fs, such a power spectrum estimate could thus be obtained as described in the samples from scipy's documentation with the following:
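from scipy import signal
# scaling='spectrum' returns power per bin, so its square root is an RMS amplitude
f, Pxx = signal.periodogram(x, fs, scaling='spectrum')
A_rms = np.sqrt(Pxx)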
The corresponding frequencies f are also calculated in the process, so you could then plot the result with
plt.plot(f, A_rms)
Using scipy.signal.welch is quite similar, but uses a slightly different implementation which provides a different accuracy/resolution tradeoff.
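A short sketch comparing the two estimators on a synthetic tone. The 1000 Hz sampling rate, the 50 Hz tone and the segment length are assumed values, chosen so that the tone falls exactly on a bin in both cases:

```python
import numpy as np
from scipy import signal

fs = 1000.0                                   # assumed sampling rate in Hz
t = np.arange(2000) / fs
amp = 2 * np.sqrt(2)                          # peak amplitude, so the RMS value is 2
x = amp * np.sin(2 * np.pi * 50.0 * t)        # assumed 50 Hz test tone

# scaling="spectrum" returns power per bin (V**2) instead of a density (V**2/Hz)
f, Pxx = signal.periodogram(x, fs, scaling="spectrum")
A_rms = np.sqrt(Pxx)

# Welch averages shorter windowed segments: coarser resolution, lower variance
f_w, Pxx_w = signal.welch(x, fs, nperseg=200, scaling="spectrum")
A_rms_w = np.sqrt(Pxx_w)

print(f[np.argmax(A_rms)], A_rms.max())
print(f_w[np.argmax(A_rms_w)], A_rms_w.max())
```

Both estimates peak at the tone's frequency with a value close to its RMS amplitude. With the default scaling="density" the square root would instead be an amplitude spectral density, whose value depends on the frequency resolution.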
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
fs = 10e3
N = 1e5
amp = 2 * np.sqrt(2)
noise_power = 0.01 * fs / 2
time = np.arange(N) / float(fs)
mod = 500*np.cos(2*np.pi*0.25*time)
carrier = amp * np.sin(2*np.pi*3e3*time + mod)
noise = np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
noise *= np.exp(-time/5)
x = carrier + noise
f, t, Sxx = signal.spectrogram(x, fs)
plt.pcolormesh(t, f, Sxx)
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
This is pulled from the scipy documentation, as you will need a scientific computing library to create a spectrogram.
Install scipy on your machine if you do not have it already and read its documentation:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.spectrogram.html
I have noisy data for which I want to calculate the frequency and amplitude. The samples were collected every 1/100th sec. From the trends, I believe the frequency to be ~0.3 Hz.
When I use the numpy fft module, I end up getting a very high frequency (36.32 /sec), which is clearly not correct. I tried to filter the data with pandas rolling_mean to remove the noise before the FFT, but that didn't work either.
import pandas as pd
from numpy import fft
import numpy as np
import matplotlib.pyplot as plt
Moisture_mean_x = pd.read_excel("signal.xlsx", header = None)
Moisture_mean_x = Moisture_mean_x.rolling(10).mean() # doesn't help (was pd.rolling_mean in older pandas)
Moisture_mean_x = Moisture_mean_x.dropna()
Moisture_mean_x = Moisture_mean_x -Moisture_mean_x.mean()
frate = 100. #/sec
Hn = fft.fft(Moisture_mean_x)
freqs = fft.fftfreq(len(Hn), 1/frate)
idx = np.argmax(np.abs(Hn))
freq_in_hertz = freqs[idx]
Can someone guide me how to fix this?
You are right, there is something wrong. One needs to explicitly ask pandas for the zeroth column:
Hn = np.fft.fft(Moisture_mean_x[0])
Otherwise something goes wrong, which you can see from the fact that the FFT result was not symmetric, as it should be for real input.
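One plausible explanation, sketched here with a plain NumPy array of shape (N, 1) standing in for the single-column DataFrame: np.fft.fft transforms along the last axis by default, and that axis has length 1. The 0.3 Hz test signal and the noise level are assumptions for illustration:

```python
import numpy as np

fs = 100.0                                    # sampling rate from the question
N = 1000
t = np.arange(N) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 0.3 * t) + 0.5 * rng.standard_normal(N)   # assumed test signal

column = x.reshape(-1, 1)                     # shape (N, 1), like a one-column table

# The FFT runs along the last axis, which here has length 1, so every
# "transform" is just the input value itself
wrong = np.fft.fft(column)
print(wrong.shape, np.allclose(wrong[:, 0], x))

# Transforming the 1-D signal gives a proper spectrum
Hn = np.fft.rfft(x - x.mean())
freqs = np.fft.rfftfreq(N, d=1.0 / fs)
peak_freq = freqs[np.argmax(np.abs(Hn))]
print(peak_freq)
```

In the two-dimensional case the argmax simply finds the largest sample of the signal itself, so the "frequency" it points to is meaningless.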
Seems like @tillsten already answered your question, but here is some additional confirmation. The first plot is your data (zero mean, and I changed it to a csv). The second is the power spectral density, and you can see a fat mass with a peak at ~0.3 Hz. I 'zoomed' in on the third plot to see if there was a second hidden frequency close to the main frequency.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
x = pd.read_csv("signal.csv")
x = np.array(x, dtype=float)[:,0]
x = x - np.mean(x)
fs = 1e2
f, Pxx = signal.welch(x, fs, nperseg=1024)
f_res, Pxx_res = signal.welch(x, fs, nperseg=2048)
plt.subplot(3,1,1)
plt.plot(x)
plt.subplot(3,1,2)
plt.plot(f, Pxx)
plt.xlim([0, 1])
plt.xlabel('frequency [Hz]')
plt.ylabel('PSD')
plt.subplot(3,1,3)
plt.plot(f_res, Pxx_res)
plt.xlim([0, 1])
plt.xlabel('frequency [Hz]')
plt.ylabel('PSD')
plt.show()
Hn = np.fft.fft(x)
freqs = np.fft.fftfreq(len(Hn), 1/fs)
idx = np.argmax(np.abs(Hn))
freq_in_hertz = freqs[idx]
print('Main freq:', freq_in_hertz)
print('RMS amp:', np.sqrt(Pxx.max()))
This prints:
Main freq: 0.32012805122
RMS amp: 0.0556044913489
An FFT is a filter bank. Just look for the magnitude peak only within the expected frequency range in the FFT result (instead of the entire result vector), and most of the other spectrum will essentially be filtered out.
It isn't necessary to filter the signal beforehand, because the FFT is a filter. Just skip those parts of the FFT that correspond to frequencies you know to contain a lot of noise - zero them out, or otherwise exclude them.
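A minimal sketch of searching for the peak only inside an expected band. The helper name `peak_in_band`, the band limits and the synthetic test data are assumptions for illustration:

```python
import numpy as np

def peak_in_band(x, fs, f_lo, f_hi):
    """Return the frequency of the largest FFT magnitude between f_lo and f_hi (Hz)."""
    x = np.asarray(x, dtype=float)
    mag = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    if not band.any():
        raise ValueError("no FFT bins fall inside the requested band")
    return freqs[band][np.argmax(mag[band])]

# Assumed test data: weak 0.3 Hz component plus strong 36 Hz interference
fs = 100.0
t = np.arange(2000) / fs
x = 0.2 * np.sin(2 * np.pi * 0.3 * t) + np.sin(2 * np.pi * 36.0 * t)

print(peak_in_band(x, fs, 0.0, fs / 2))    # whole spectrum: the interference wins
print(peak_in_band(x, fs, 0.1, 1.0))       # restricted band: the slow component
```

Restricting the search band plays the role of the filter: bins outside it are simply never considered.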
I hope this may help you.
https://scipy-cookbook.readthedocs.io/items/ButterworthBandpass.html
You should filter only the band around the expected frequency and improve the signal-to-noise ratio before applying the FFT.
Edit:
Mark Ransom gave a smarter answer: if you have to do the FFT anyway, you can just cut off the noise after the transformation. It won't give a worse result than a filter would.
You should use a low-pass filter first, which should keep the larger periodic variations and smooth out some of the higher-frequency stuff. After that, you can do the FFT to get the peaks. Here is a recipe for an FIR filter typically used for this exact sort of thing.
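Since the linked recipe is not reproduced here, the following is only a rough sketch of low-pass filtering before the FFT, using scipy.signal.firwin and scipy.signal.filtfilt. The tap count, the 2 Hz cutoff and the synthetic test data are assumed values, not taken from the recipe:

```python
import numpy as np
from scipy import signal

fs = 100.0                                     # sampling rate from the question
t = np.arange(2000) / fs
# Assumed test data: slow 0.3 Hz component plus 36 Hz interference
x = np.sin(2 * np.pi * 0.3 * t) + 0.5 * np.sin(2 * np.pi * 36.0 * t)

# 101-tap low-pass FIR filter with a 2 Hz cutoff (both values are illustrative)
taps = signal.firwin(101, cutoff=2.0, fs=fs)
filtered = signal.filtfilt(taps, 1.0, x)       # forward-backward pass, no phase shift

freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
before = np.abs(np.fft.rfft(x))
after = np.abs(np.fft.rfft(filtered))
k36 = np.argmin(np.abs(freqs - 36.0))          # bin closest to the interference

print(freqs[np.argmax(after)])                 # dominant frequency after filtering
print(after[k36] / before[k36])                # remaining fraction of the 36 Hz component
```

The cutoff has to sit above the frequency of interest and below the noise to be removed, so it needs to be chosen from what is known about the data.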