Scipy FFT Frequency Analysis of very noisy signal

I have noisy data for which I want to calculate the frequency and amplitude. The samples were collected every 1/100th of a second. From the trends, I believe the frequency to be ~0.3 Hz.
When I use the numpy fft module, I end up with a very high frequency (36.32 /sec), which is clearly not correct. I tried to filter the data with pandas rolling_mean to remove the noise before the FFT, but that didn't work either.
import pandas as pd
from numpy import fft
import numpy as np
import matplotlib.pyplot as plt
Moisture_mean_x = pd.read_excel("signal.xlsx", header = None)
Moisture_mean_x = pd.rolling_mean(Moisture_mean_x, 10) # doesn't help
Moisture_mean_x = Moisture_mean_x.dropna()
Moisture_mean_x = Moisture_mean_x -Moisture_mean_x.mean()
frate = 100. #/sec
Hn = fft.fft(Moisture_mean_x)
freqs = fft.fftfreq(len(Hn), 1/frate)
idx = np.argmax(np.abs(Hn))
freq_in_hertz = freqs[idx]
Can someone guide me how to fix this?

You are right, there is something wrong. You need to explicitly ask pandas for the zeroth column:
Hn = np.fft.fft(Moisture_mean_x[0])
Otherwise something goes wrong, which you can see from the fact that the FFT result was not symmetric, as it should be for real input (the magnitude spectrum of a real signal is mirror-symmetric).
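
To see why the column has to be selected, here is a small sketch on a synthetic signal (the 0.3 Hz tone and all variable names are assumptions for illustration, not the asker's data). A one-column DataFrame converts to an array of shape (N, 1), and np.fft.fft transforms along the last axis by default, so every row is treated as a length-1 signal and comes back unchanged.

```python
import numpy as np

fs = 100.0                          # assumed sampling rate in Hz
t = np.arange(2000) / fs            # 20 s of samples
x = np.sin(2 * np.pi * 0.3 * t)     # synthetic 0.3 Hz tone (6 full periods)

# Same shape as the values of a one-column DataFrame
column_like = x.reshape(-1, 1)

# FFT along the last axis (length 1): the "spectrum" is just the input again
wrong = np.fft.fft(column_like)
print(np.allclose(wrong[:, 0], x))

# FFT of the 1-D column: an actual spectrum with the peak at the tone
right = np.fft.fft(x)
freqs = np.fft.fftfreq(x.size, d=1 / fs)
peak = abs(freqs[np.argmax(np.abs(right))])
print(peak)
```

With the 2-D input, argmax(abs(Hn)) just finds the largest sample in the time series, which would explain a meaningless "frequency".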

It seems @tillsten already answered your question, but here is some additional confirmation. The first plot is your data (with the mean removed; I converted it to a CSV). The second is the power spectral density, and you can see a broad hump with a peak at ~0.3 Hz. I 'zoomed' in for the third plot to see whether there was a second hidden frequency close to the main frequency.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
x = pd.read_csv("signal.csv")
x = np.array(x, dtype=float)[:,0]
x = x - np.mean(x)
fs = 1e2
f, Pxx = signal.welch(x, fs, nperseg=1024)
f_res, Pxx_res = signal.welch(x, fs, nperseg=2048)
plt.subplot(3,1,1)
plt.plot(x)
plt.subplot(3,1,2)
plt.plot(f, Pxx)
plt.xlim([0, 1])
plt.xlabel('frequency [Hz]')
plt.ylabel('PSD')
plt.subplot(3,1,3)
plt.plot(f_res, Pxx_res)
plt.xlim([0, 1])
plt.xlabel('frequency [Hz]')
plt.ylabel('PSD')
plt.show()
Hn = np.fft.fft(x)
freqs = np.fft.fftfreq(len(Hn), 1/fs)
idx = np.argmax(np.abs(Hn))
freq_in_hertz = freqs[idx]
print('Main freq:', freq_in_hertz)
print('RMS amp:', np.sqrt(Pxx.max()))
This prints:
Main freq: 0.32012805122
RMS amp: 0.0556044913489

An FFT is a filter bank. Just look for the magnitude peak only within the expected frequency range in the FFT result (instead of the entire result vector), and most of the other spectrum will essentially be filtered out.

It isn't necessary to filter the signal beforehand, because the FFT is a filter. Just skip those parts of the FFT that correspond to frequencies you know to contain a lot of noise - zero them out, or otherwise exclude them.
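
A minimal sketch of that idea, on a synthetic noisy signal rather than the asker's data (the 0.3 Hz tone, the noise level, and the 0.1 to 1 Hz search band are all assumptions): compute the spectrum once, then take the argmax only over the bins inside the expected band.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 100.0                                   # sampling rate in Hz
t = np.arange(10000) / fs                    # 100 s of samples
# Weak 0.3 Hz tone buried in much stronger white noise (synthetic data)
x = 0.2 * np.sin(2 * np.pi * 0.3 * t) + rng.normal(0.0, 1.0, t.size)
x = x - x.mean()

spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(x.size, d=1 / fs)

# Only look for the peak inside the band where the signal is expected
f_lo, f_hi = 0.1, 1.0                        # assumed search band in Hz
band = (freqs >= f_lo) & (freqs <= f_hi)
peak_freq = freqs[band][np.argmax(spectrum[band])]
print(peak_freq)
```

The frequency resolution here is fs/N = 0.01 Hz, so a longer record gives a finer estimate.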

I hope this helps.
https://scipy-cookbook.readthedocs.io/items/ButterworthBandpass.html
You should filter only the band around the expected frequency to improve the signal-to-noise ratio before applying the FFT.
Edit:
Mark Ransom gave a smarter answer: if you have to do the FFT anyway, you can just cut off the noise after the transformation. It won't give a worse result than a filter would.

You should use a low-pass filter first, which should keep the larger periodic variations and smooth out some of the higher-frequency content. After that, you can do the FFT to get the peaks. Here is a recipe for FIR filter typically used for this exact sort of thing.
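
The linked recipe is not reproduced here, so the following is only a sketch of the general approach using scipy.signal.firwin and scipy.signal.filtfilt. The signal is synthetic, and the 2 Hz cutoff and 501 taps are assumptions chosen for a 100 Hz sampling rate.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(1)
fs = 100.0
t = np.arange(10000) / fs
# Synthetic data: weak 0.3 Hz tone plus strong white noise
x = 0.2 * np.sin(2 * np.pi * 0.3 * t) + rng.normal(0.0, 1.0, t.size)

# Windowed-sinc FIR low-pass; cutoff and length are illustrative choices
taps = signal.firwin(501, 2.0, fs=fs)
# filtfilt runs the filter forwards and backwards, so there is no phase shift
smoothed = signal.filtfilt(taps, [1.0], x)

spectrum = np.abs(np.fft.rfft(smoothed - smoothed.mean()))
freqs = np.fft.rfftfreq(smoothed.size, d=1 / fs)
peak_freq = freqs[np.argmax(spectrum)]
print(peak_freq, x.std(), smoothed.std())
```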

Related

Odd result in python scipy FFT

I have access to scipy and want to compute the FFT of a simple Gaussian function, exp(-t^2). It is well known that the Fourier transform of exp(−t^2) is √π exp(−π^2*k^2). But the FFT of exp(-t^2) was not the same as √π exp(−π^2*k^2).
I have tried the following code:
import scipy.fftpack as fft
from scipy import integrate
import numpy as np
import matplotlib.pyplot as plt
#FFT
N=int(1e+3)
T=0.01 #sample period
t = np.linspace(0,N*T, N)
h=np.exp(-t**2)
H_shift=2*np.abs(fft.fftshift(np.fft.fft(h)/N))
freq=fft.fftshift(fft.fftfreq(h.shape[0],t[1]-t[0]))
#Comparing FFT with fourier transform
def f(x):
    return np.exp(-x**2)
def F(k):
    return (np.pi**0.5)*np.exp((-np.pi**2)*(k**2))
plt.figure(num=1)
plt.plot(freq,F(freq),label=("Fourier Transform"))
plt.legend()
plt.figure(num=2)
plt.plot(freq,H_shift,label=("FFT"))
plt.legend()
plt.show()
#Checking Parseval's Theorem
S_h=integrate.simps(h**2,t)
#0.62665690150683084
S_H_s=integrate.simps(H_shift**2,freq)
#0.025215875346935791
S_F=integrate.simps(F(freq)**2,freq)
#1.2533141373154999
The graphs I plotted are not the same, and the FFT values do not satisfy Parseval's theorem. It should be that S_H_s=S_h*2, but my result was not. I think S_H_s, the result from the FFT, is the wrong value, because S_F=S_h*2.
Is there any problem in my code?? Help is greatly appreciated! Thanks in advance.

I suggest you plot your input signal h and verify that it looks like a Gaussian.
Spoiler alert: it doesn't, it is half a Gaussian!
By cutting it like this, you introduce a lot of high frequencies that you see in your plot.
To do this experiment correctly, follow this recipe to create your input signal:
t = np.linspace(-(N/2)*T,(N/2-1)*T, N)
h = np.exp(-t**2)
h = fft.ifftshift(h)
The ifftshift function serves to move the t=0 location to the leftmost array element. Note that t here is constructed carefully such that t=0 is exactly in the right place for this to work correctly, assuming an even-sized N. You can verify that fft.ifftshift(t)[0] is 0.0.
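
As a check, here is a sketch that follows this recipe and compares the result with the analytical transform. One assumption beyond the answer: the FFT is multiplied by the sample period T (rather than divided by N) so that the sum approximates the Fourier integral.

```python
import numpy as np

N = 1000
T = 0.01                                       # sample period
t = np.linspace(-(N / 2) * T, (N / 2 - 1) * T, N)
h = np.exp(-t**2)                              # full Gaussian, centred on t = 0

# Move t = 0 to the first element, transform, and scale by the step
H = T * np.fft.fft(np.fft.ifftshift(h))
freq = np.fft.fftfreq(N, d=T)
F = np.sqrt(np.pi) * np.exp(-np.pi**2 * freq**2)   # analytical transform

max_error = np.max(np.abs(H - F))              # H is real up to rounding error

# Parseval: energy in time equals energy in frequency
df = 1.0 / (N * T)
energy_time = np.sum(h**2) * T
energy_freq = np.sum(np.abs(H)**2) * df
print(max_error, energy_time, energy_freq)
```

Both energies come out close to sqrt(pi/2) ≈ 1.2533, the value the question reports for S_F.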

Using scipy.fftpack.fft how to interprete numerical result of Fourier Transform

The analytical Fourier transform of a sinusoidal signal is purely imaginary. However, when numerically computing the discrete Fourier transform, the result is not.
Consider therefore the following code
import matplotlib.pyplot as plt
import numpy as np
from scipy.fftpack import fft, fftfreq
f_s = 200 # Sampling rate = number of measurements per second in [Hz]
t = np.arange(0,10000, 1 / f_s)
N = len(t)
A = 4 # Amplitude of sine signal
x = A * np.sin(t)
X = fft(x)[1:N//2]
freqs = (fftfreq(len(x)) * f_s)[1:N//2]
fig, (ax1,ax2) = plt.subplots(2,1, sharex = True)
ax1.plot(freqs, X.real, label = r"$\Re[X(\omega)]$")
ax1.plot(freqs, X.imag, label = r"$\Im[X(\omega)]$")
ax1.set_title(r"Discrete Fourier Transform of $x(t) = A \cdot \sin(t)$")
ax1.legend()
ax1.grid(True)
ax2.plot(freqs, np.abs(X), label = r"$|X(\omega)|$")
ax2.legend()
ax2.set_xlabel(r"Frequency $\omega$")
ax2.set_yscale("log")
ax2.grid(True, which = "both")
ax2.set_xlim(0.15,0.175)
plt.show()
Clearly, the absolute value |X(w)| can be used as a good approximation to the analytical result. However, the real and imaginary parts of X(w) differ from the analytical ones. Another question on SO already mentioned this fact but did not explain why. So can I only use the absolute value and the phase?
Another question is how the amplitude is related to the numerical result. Mathematically speaking, it should be the integral under the curve of |X(w)| divided by the normalization (which, as far as I understand, should be given by N), i.e. approximately
A_approx = np.sum(np.abs(X)) / N
print(f"Numerical value: {A_approx:.1f}, Correct value: {A:.1f}")
Numerical value: 13.5, Correct value: 4.0
This does not seem to be the case. Any insights? Ideas?
Related questions which did not help are here and here.

An FFT does not produce the result you expect because it is finite in length, and thus more similar to the Fourier Transform of a rectangular window on your sinusoid. The length and placement of this rectangular window will affect the phase and amplitude of the FFT result.
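
A small sketch of that window effect (the 5 Hz and 5.05 Hz test frequencies and the helper name peak_amplitude are assumptions, not from the question). When the record holds a whole number of periods, the peak bin is purely imaginary and 2*|X|/N recovers the amplitude; when it does not, the energy leaks into neighbouring bins and the peak underestimates it.

```python
import numpy as np

fs = 200.0                     # sampling rate in Hz
N = 2000                       # 10 s record
t = np.arange(N) / fs
A = 4.0                        # true amplitude

def peak_amplitude(f0):
    """Return (2*|X|/N at the largest bin, the complex value of that bin)."""
    x = A * np.sin(2 * np.pi * f0 * t)
    X = np.fft.rfft(x)
    k = np.argmax(np.abs(X))
    return 2 * np.abs(X[k]) / N, X[k]

# 5 Hz: exactly 50 periods fit in the record, so no leakage
amp_exact, X_exact = peak_amplitude(5.0)
# 5.05 Hz: 50.5 periods, the worst case for a rectangular window
amp_leaky, X_leaky = peak_amplitude(5.05)

print(amp_exact, X_exact)
print(amp_leaky, X_leaky)
```

In the question, sin(t) has a frequency of 1/(2π) Hz, which never falls exactly on a bin, so some leakage is unavoidable without windowing or fitting.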

FFT normalization with numpy

I just started working with the numpy package and began with the simple task of computing the FFT of an input signal. Here's the code:
import numpy as np
import matplotlib.pyplot as plt
#Some constants
L = 128
p = 2
X = 20
x = np.arange(-X/2,X/2,X/L)
fft_x = np.linspace(0,128,128, True)
fwhl = 1
fwhl_y = (2/fwhl) \
*(np.log([2])/np.pi)**0.5*np.e**(-(4*np.log([2]) \
*x**2)/fwhl**2)
fft_fwhl = np.fft.fft(fwhl_y, norm='ortho')
ampl_fft_fwhl = np.abs(fft_fwhl)
plt.bar(fft_x, ampl_fft_fwhl, width=.7, color='b')
plt.show()
Since I am working with an exponential function with a constant factor involving pi in front of it, I expect to get an exponential function in Fourier space, with the constant (zero-frequency) component of the FFT equal to 1.
But the value of that component I get from numpy is larger (about 1.13). Here I have an amplitude spectrum normalized by 1/(number_of_counts)**0.5 (that's what I read in the numpy documentation). I can't understand what's wrong. Can anybody help me?
Thanks!
[EDITED] It seems the problem is solved: all you need to make the FFT agree with the Fourier integral is to multiply the FFT by the step (in my case X/L). As for the norm='ortho' option of numpy.fft.fft, it scales both the forward and the inverse transform by 1/sqrt(N) so that the transform is unitary; with the default setting the forward transform is unscaled and the inverse carries the full 1/N factor. Thanks everyone for their help!

I've finally solved my problem. All you need to tie the FFT to the Fourier integral is to multiply the result of the transform (FFT) by the step (X/L in my case, i.e. FFT*X/L); this works in general. My case is a bit more complex, since I have an extra constraint on the function to be transformed. I have to make sure that the area under the curve equals 1, because it is a model of the δ function. Since the step is fixed, I have to satisfy the condition step*sum(fwhl_y)=1, that is, X/L=1/sum(fwhl_y). So to get the correct result I have to do the following:
Calculate the FFT: fft_fwhl = np.fft.fft(fwhl_y)
Get rid of the phase component, which arises because fwhl_y is symmetric, i.e. defined on the interval [-T/2,T/2] (T being the period), whereas np.fft.fft treats the function as defined on [0,T]. Since I only need the amplitude spectrum, I simply use np.abs(FFT).
To get the values I expect, multiply the result of the previous step by X/L, that is, np.abs(FFT)*X/L.
Because of the extra condition on the area under the curve, X/L*sum(fwhl_y)=1, I finally arrive at np.abs(FFT)*X/L = np.abs(FFT)/sum(fwhl_y).
Hope it'll help anyone at least.
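
The steps above can be sketched as follows, using the constants from the question (the names step, spectrum and area are introduced here for illustration). The zero-frequency component of the scaled amplitude spectrum comes out as 1, as expected for a unit-area pulse.

```python
import numpy as np

L = 128
X = 20
step = X / L                                   # sampling step of the x grid
x = np.arange(-X / 2, X / 2, step)
fwhl = 1.0

# Unit-area Gaussian with full width at half maximum fwhl
fwhl_y = (2 / fwhl) * np.sqrt(np.log(2) / np.pi) \
    * np.exp(-4 * np.log(2) * x**2 / fwhl**2)

# Amplitude spectrum scaled by the step, approximating the Fourier integral
spectrum = np.abs(np.fft.fft(fwhl_y)) * step

# Area under the curve (rectangle rule); should be 1 for this model
area = step * fwhl_y.sum()
print(area, spectrum[0])
```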

Here's a possible solution to your problem:
import numpy as np
import matplotlib.pyplot as plt
from scipy import fft
from numpy import log, pi, e
# Signal setup
Fs = 150
Ts = 1.0 / Fs
t = np.arange(0, 1, Ts)
ff = 50
fwhl = 1
y = (2 / fwhl) * (log([2]) / pi)**0.5 * e**(-(4 * log([2]) * t**2) / fwhl**2)
# Plot original signal
plt.subplot(2, 1, 1)
plt.plot(t, y, 'k-')
plt.xlabel('time')
plt.ylabel('amplitude')
# Normalized FFT
plt.subplot(2, 1, 2)
n = len(y)
k = np.arange(n)
T = n / Fs
frq = k / T
freq = frq[:n // 2]  # keep the first half of the frequency axis
Y = np.fft.fft(y) / n
Y = Y[:n // 2]
plt.plot(freq, abs(Y), 'r-')
plt.xlabel('freq (Hz)')
plt.ylabel('|Y(freq)|')
plt.show()
(Plots for fwhl=1 and fwhl=0.1 are not reproduced here.)
The plots show how the exponential and its FFT vary as fwhl approaches 0.

Bandwidth of an EEG signal

I'm trying to perform an FFT of an EEG signal in Python and then, based on the bandwidth, determine whether it's an alpha or beta signal. The code looked fine, but the resulting plots are nothing like they should be; the frequencies and magnitude values are not what I expected. Any help appreciated; here's the code:
from scipy.io import loadmat
import scipy
import numpy as np
from pylab import *
import matplotlib.pyplot as plt
eeg = loadmat("eeg_2013.mat");
eeg1=eeg['eeg1'][0]
eeg2=eeg['eeg2'][0]
fs = eeg['fs'][0][0]
fft1 = scipy.fft(eeg1)
f = np.linspace (fs,len(eeg1), len(eeg1), endpoint=False)
plt.figure(1)
plt.subplot(211)
plt.plot (f, abs (fft1))
plt.title ('Magnitude spectrum of the signal')
plt.xlabel ('Frequency (Hz)')
show()
plt.subplot(212)
fft2 = scipy.fft(eeg2)
f = np.linspace (fs,len(eeg2), len(eeg2), endpoint=False)
plt.plot (f, abs (fft2))
plt.title ('Magnitude spectrum of the signal')
plt.xlabel ('Frequency (Hz)')
show()
And the plots:

To get an array of the FFT frequencies, you should use fftfreq; it gives you an array of frequencies to use as the abscissa:
from scipy.io import loadmat
from scipy.fft import fft, fftfreq
eeg = loadmat("eeg_2013.mat")
eeg1 = eeg['eeg1'][0]
eeg2 = eeg['eeg2'][0]
fs = eeg['fs'][0][0]
fft1 = fft(eeg1)
f = fftfreq(eeg1.size, 1/fs)  # frequency in Hz for each FFT bin
Sorry, I can't test this code in real conditions because you didn't post a data sample, but I hope this should work.
Concerning how to determine the bandwidth: as far as I understand, you want to get the fundamental frequency. There are different ways, more or less complicated depending on whether your signal is noisy or not. In your case, you only want to know whether the fundamental frequency f0 is in the range 8-13 Hz (alpha) or 13-30 Hz (beta). One very simple way is to compute the maximum of the FFT magnitude in the range 8-13 Hz, np.abs(fft1[(f>8) & (f<13)]).max(), and if it's more than, say, 1000, it's an alpha wave; otherwise it's beta. If your signals are less clear-cut, please post some examples of the different kinds of samples and the result you would expect, so that we can try more complicated algorithms.
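
Here is a sketch of a variant that avoids picking a fixed threshold by comparing the peak magnitudes in the two bands directly. The function name dominant_band, the synthetic test signals, and the 256 Hz sampling rate are assumptions; this has not been tried on the EEG file from the question.

```python
import numpy as np

def dominant_band(x, fs):
    """Return 'alpha' or 'beta' depending on which band holds the larger FFT peak."""
    mag = np.abs(np.fft.rfft(x - np.mean(x)))
    f = np.fft.rfftfreq(len(x), d=1 / fs)
    alpha_peak = mag[(f >= 8) & (f < 13)].max()     # 8-13 Hz
    beta_peak = mag[(f >= 13) & (f <= 30)].max()    # 13-30 Hz
    return "alpha" if alpha_peak > beta_peak else "beta"

# Synthetic test signals: a 10 Hz and a 20 Hz tone in moderate noise
rng = np.random.default_rng(2)
fs = 256.0
t = np.arange(1024) / fs
alpha_like = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.normal(size=t.size)
beta_like = np.sin(2 * np.pi * 20 * t) + 0.5 * rng.normal(size=t.size)

print(dominant_band(alpha_like, fs), dominant_band(beta_like, fs))
```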

If your sampling frequency is fs and you have N=len(eeg1) samples, then the FFT will, of course, return an array of N values. The first N/2 of them correspond to the frequency range 0..fs/2; the second half corresponds to the mirrored frequency range -fs/2..0. For real input signals the mirrored half is just the complex conjugate of the positive half, so it can be disregarded in further analysis (but not in the inverse FFT).
So essentially, you should construct the frequency axis as
f=linspace(0,N-1,N)*fs/N
Edit: or, even simpler, with minimal changes to the initial code:
f = np.linspace (0,fs,len(eeg1), endpoint=False)
so that f ranges from 0 to just below fs. Then disregard the second half of the FFT result in the output:
plt.plot(f[:N//2], abs(fft1[:N//2]))
Added: You can use fftshift to swap the two halves; the correct frequency range is then
f = np.linspace (-fs/2,fs/2,len(eeg1), endpoint=False)
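
The frequency layout described above can be checked numerically. This sketch uses random data and an assumed fs of 250 Hz; it also shows np.fft.rfft and np.fft.rfftfreq, which return only the non-negative half for real input.

```python
import numpy as np

fs = 250.0                                   # assumed sampling rate in Hz
N = 1000                                     # even number of samples
x = np.random.default_rng(3).normal(size=N)  # any real signal will do

full = np.fft.fft(x)
f_full = np.fft.fftfreq(N, d=1 / fs)         # 0..fs/2, then -fs/2..0
f_shifted = np.fft.fftshift(f_full)          # reordered to -fs/2..fs/2

# One-sided transform for real input: bins 0..N/2 only
half = np.fft.rfft(x)
f_half = np.fft.rfftfreq(N, d=1 / fs)

print(f_shifted[0], f_shifted[-1], f_half[-1])
```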

fft bandpass filter in python

What I am trying to do is filter my data with an FFT. I have a noisy signal recorded at 500 Hz as a 1-D array. The high-frequency cutoff should be 20 Hz and the low-frequency cutoff 10 Hz.
What I have tried is:
fft=scipy.fft(signal)
bp=fft[:]
for i in range(len(bp)):
    if not 10<i<20:
        bp[i]=0
ibp=scipy.ifft(bp)
What I get now are complex numbers. So something must be wrong. What? How can I correct my code?

It's worth noting that the indices of your bp are not necessarily in Hz; they depend on the sampling frequency of the signal, so you should convert them with a helper such as scipy.fftpack.fftfreq (or scipy.fftpack.rfftfreq, which matches the packed output layout of scipy.fftpack.rfft). Also, if your signal is real you should be using scipy.fftpack.rfft. Here is a minimal working example that filters out all frequencies below a specified value:
import numpy as np
from scipy.fftpack import rfft, irfft, rfftfreq
time = np.linspace(0,10,2000)
signal = np.cos(5*np.pi*time) + np.cos(7*np.pi*time)
# rfftfreq follows the packed layout of fftpack.rfft: [0, f1, f1, f2, f2, ...]
W = rfftfreq(signal.size, d=time[1]-time[0])
f_signal = rfft(signal)
# If our original signal time was in seconds, this is now in Hz
cut_f_signal = f_signal.copy()
cut_f_signal[(W<3)] = 0  # removes the 2.5 Hz term, keeps the 3.5 Hz term
cut_signal = irfft(cut_f_signal)
We can plot the evolution of the signal in real and fourier space:
import pylab as plt
plt.subplot(221)
plt.plot(time,signal)
plt.subplot(222)
plt.plot(W,f_signal)
plt.xlim(0,10)
plt.subplot(223)
plt.plot(W,cut_f_signal)
plt.xlim(0,10)
plt.subplot(224)
plt.plot(time,cut_signal)
plt.show()

There's a fundamental flaw in what you are trying to do here - you're applying a rectangular window in the frequency domain which will result in a time domain signal which has been convolved with a sinc function. In other words there will be a large amount of "ringing" in the time domain signal due to the step changes you have introduced in the frequency domain. The proper way to do this kind of frequency domain filtering is to apply a suitable window function in the frequency domain. Any good introductory DSP book should cover this.
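
As a rough sketch of what a smoother frequency-domain mask could look like, the example below uses raised-cosine edges instead of hard steps. The helper tapered_bandpass_gain, the 2 Hz transition width, and the test tones are assumptions for illustration; a properly designed time-domain filter is often the better tool. Using rfft/irfft also keeps the output real, which addresses the complex result in the question.

```python
import numpy as np

def tapered_bandpass_gain(freqs, low, high, width):
    """Gain of 1 inside [low, high] with raised-cosine roll-off of the given width."""
    gain = np.zeros_like(freqs)
    rise = (freqs >= low - width) & (freqs < low)
    gain[rise] = 0.5 * (1 - np.cos(np.pi * (freqs[rise] - (low - width)) / width))
    gain[(freqs >= low) & (freqs <= high)] = 1.0
    fall = (freqs > high) & (freqs <= high + width)
    gain[fall] = 0.5 * (1 + np.cos(np.pi * (freqs[fall] - high) / width))
    return gain

fs = 500.0                                   # sampling rate from the question
N = 2000
t = np.arange(N) / fs
wanted = np.sin(2 * np.pi * 15 * t)          # inside the 10-20 Hz pass band
x = wanted + np.sin(2 * np.pi * 2 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)

freqs = np.fft.rfftfreq(N, d=1 / fs)         # bin frequencies in Hz
gain = tapered_bandpass_gain(freqs, 10.0, 20.0, 2.0)
filtered = np.fft.irfft(np.fft.rfft(x) * gain, n=N)

print(np.max(np.abs(filtered - wanted)))
```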
