Odd result in python scipy FFT

I have access to scipy and want to compute the FFT of the simple Gaussian function exp(-t^2). It is well known that the Fourier transform of exp(−t^2) is √π*exp(−π^2*k^2). But the FFT of exp(-t^2) was not the same as √π*exp(−π^2*k^2).
I have tried the following code:
import scipy.fftpack as fft
from scipy import integrate
import numpy as np
import matplotlib.pyplot as plt
#FFT
N=int(1e+3)
T=0.01 #sample period
t = np.linspace(0,N*T, N)
h=np.exp(-t**2)
H_shift=2*np.abs(fft.fftshift(np.fft.fft(h)/N))
freq=fft.fftshift(fft.fftfreq(h.shape[0],t[1]-t[0]))
#Comparing FFT with fourier transform
def f(x):
    return np.exp(-x**2)
def F(k):
    return (np.pi**0.5)*np.exp((-np.pi**2)*(k**2))
plt.figure(num=1)
plt.plot(freq,F(freq),label=("Fourier Transform"))
plt.legend()
plt.figure(num=2)
plt.plot(freq,H_shift,label=("FFT"))
plt.legend()
plt.show()
#Checking Parseval's Theorm
S_h=integrate.simps(h**2,t)
#0.62665690150683084
S_H_s=integrate.simps(H_shift**2,freq)
#0.025215875346935791
S_F=integrate.simps(F(freq)**2,freq)
#1.2533141373154999
The graph I plotted is not the same, and the FFT values do not satisfy Parseval's theorem. It should be that S_H_s = S_h*2, but my result was not. I think S_H_s, the result of the FFT, is the wrong value, because S_F = S_h*2 does hold.
Is there any problem in my code? Help is greatly appreciated! Thanks in advance.

I suggest you plot your input signal h and verify that it looks like a Gaussian.
Spoiler alert: it doesn't, it is half a Gaussian!
By cutting it like this, you introduce a lot of high frequencies that you see in your plot.
To do this experiment correctly, follow this recipe to create your input signal:
t = np.linspace(-(N/2)*T,(N/2-1)*T, N)
h = np.exp(-t**2)
h = fft.ifftshift(h)
The ifftshift function serves to move the t=0 location to the leftmost array element. Note that t here is constructed carefully such that t=0 is exactly in the right place for this to work correctly, assuming an even-sized N. You can verify that fft.ifftshift(t)[0] is 0.0.
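For completeness, a minimal sketch (reusing the N and T from the question) that applies this recipe and compares the corrected FFT with the analytic transform; the scaling by T is an assumption needed to approximate the continuous integral:
import numpy as np
import scipy.fftpack as fft
import matplotlib.pyplot as plt

N, T = 1000, 0.01
t = np.linspace(-(N/2)*T, (N/2 - 1)*T, N)          # t = 0 lands exactly on index N/2
h = np.exp(-t**2)
H = fft.fftshift(fft.fft(fft.ifftshift(h))) * T    # scale by the sample period to approximate the integral
freq = fft.fftshift(fft.fftfreq(N, T))

plt.plot(freq, H.real, label="FFT (real part)")
plt.plot(freq, (np.pi**0.5)*np.exp((-np.pi**2)*(freq**2)), '--', label="analytic transform")
plt.xlim(-3, 3)
plt.legend()
plt.show()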

Related

How to sample from a given probability distribution?

I am plotting a histogram of the probability distribution of a particle in the ground state of a one-dimensional box, using random numbers. But when compared with the original distribution, the values get cut off at the top.
Here is the code
from numpy import*
from matplotlib.pyplot import*
lit=[]
def f(x):
    return 2*(sin(pi*x)**2)
for i in range(0,10000):
    x=random.uniform(0,1)
    y=random.uniform(0,1)
    if y<f(x):
        lit.append(x)
l=linspace(0,1,10000)
hist(lit,bins=100,density=True)
plot(l,f(l))
show()
The graph produced is:
How to improve this code to match the original?
Your issue is that your density function f ranges in [0,2] but you draw y from [0,1].
Using y=np.random.uniform(0,2) will fix it.
But a better approach is to remap the uniform density to your desired function directly:
from scipy.interpolate import interp1d
l = np.linspace(0,1,100)
g = interp1d(l-np.sin(2*np.pi*l)/2/np.pi, l) # inverse for integral of f
# (not sure if analytical solution to above exists... so using interpolation instead)
lit = g(np.random.rand(10000))
In the general case where f may not have an analytical integral, you can use numerical integration instead:
dl = l[1]
g = interp1d(np.cumsum(f(l))*dl, l)
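A self-contained sketch of this numerical inverse-CDF approach might look as follows (assuming the same target density f(x) = 2*sin(pi*x)**2 on [0, 1]); normalising the cumulative sum is an added step, not part of the snippet above:
import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt

def f(x):
    return 2*np.sin(np.pi*x)**2

l = np.linspace(0, 1, 1000)
cdf = np.cumsum(f(l))
cdf /= cdf[-1]                    # normalise so the approximate CDF ends at 1
g = interp1d(cdf, l)              # numerical inverse of the CDF
lit = g(np.random.uniform(cdf[0], 1, 10000))

plt.hist(lit, bins=100, density=True)
plt.plot(l, f(l))
plt.show()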

PDF of the sum of Gaussian distributions using FFT

I am trying to derive the PDF of the sum of independent random variables. At first i would like to do this for a simple case: sum of Gaussian random variables.
I was surprised to see that I don't get a Gaussian density function when I sum an even number of gaussian random variables. I actually get:
which looks like two halves of a Gaussian distribution.
On the other hand, when I sum an odd number of Gaussian distributions I get the right distribution:
Below is the code I used to produce the results above:
import numpy as np
from scipy.stats import norm
from scipy.fftpack import fft,ifft
import matplotlib.pyplot as plt
%matplotlib inline
a=10**(-15)
end=norm(0,1).ppf(a)
sample=np.linspace(end,-end,1000)
pdf=norm(0,1).pdf(sample)
plt.subplot(211)
plt.plot(np.real(ifft(fft(pdf)**2)))
plt.subplot(212)
plt.plot(np.real(ifft(fft(pdf)**3)))
Could someone help me understand why I get odd results for even sums of Gaussian distributions?
Even though your code creates a zero-mean Gaussian PDF:
sample=np.linspace(end,-end,1000)
pdf=norm(0,1).pdf(sample)
the FFT does not know about sample, and only sees pdf with samples at 0, 1, 2, 3, ... 999. The FFT expects the origin to be the first sample of the signal. To the FFT function, your PDF is not zero mean, but has a mean of 500.
Thus, what is going on here is that you are adding two PDFs with a 500 mean, leading to one with a 1000 mean. And because the FFT imposes a periodicity to the spatial domain signal, you are seeing the PDF exiting the graph on the right and coming back in on the left.
Adding 3 PDFs shifts the mean to 1500, which due to periodicity is the same as 500, meaning it ends up in the same place as the original PDF.
The solution is to shift the origin to the first sample for the FFT, and shift the result back:
from scipy.fftpack import fftshift, ifftshift
pdf2 = fftshift(ifft(fft(ifftshift(pdf))**2))
ifftshift shifts the signal so that the center sample ends up at the first sample, and fftshift shifts it back to where you wanted it for display.
But do note that the way you generate the PDF, the origin is not at a sample, and so the above will not work exactly. Instead, use:
sample=np.linspace(end,-end,1001)
pdf=norm(0,1).pdf(sample)
By picking 1001 samples instead of 1000, zero is exactly at the middle sample.
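Putting this together, a minimal sketch of the corrected pipeline (using the 1001-sample grid so the origin sits exactly on the middle sample); the dx scaling factors are an assumption needed to keep the results normalised as densities:
import numpy as np
from scipy.stats import norm
from scipy.fftpack import fft, ifft, fftshift, ifftshift
import matplotlib.pyplot as plt

a = 1e-15
end = norm(0, 1).ppf(a)
sample = np.linspace(end, -end, 1001)
dx = sample[1] - sample[0]
pdf = norm(0, 1).pdf(sample)

pdf2 = np.real(fftshift(ifft(fft(ifftshift(pdf))**2))) * dx      # sum of 2, should be N(0, sqrt(2))
pdf3 = np.real(fftshift(ifft(fft(ifftshift(pdf))**3))) * dx**2   # sum of 3, should be N(0, sqrt(3))

plt.plot(sample, pdf2, label="sum of 2")
plt.plot(sample, norm(0, np.sqrt(2)).pdf(sample), '--', label="N(0, sqrt(2))")
plt.plot(sample, pdf3, label="sum of 3")
plt.legend()
plt.show()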
Use R!
library(ggplot2)
f <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  X <- x1 + x2
  ds <- data.frame(X = X, n = as.factor(n))  # build the data frame the plotting code below expects
  return(ds)
}
ds.list <- lapply(10^(2:5),f)
ds <- Reduce(rbind,ds.list)
ggplot(ds,aes(X,fill = n)) + geom_density(alpha = 0.5) + xlab("")
Here's the distribution plot:

Scipy FFT Frequency Analysis of very noisy signal

I have noisy data for which I want to calculate the frequency and amplitude. The samples were collected every 1/100th of a second. From the trend, I believe the frequency to be ~0.3.
When I use the numpy fft module, I end up getting a very high frequency (36.32 /sec), which is clearly not correct. I tried to filter the data with pandas rolling_mean to remove the noise before the fft, but that didn't work either.
import pandas as pd
from numpy import fft
import numpy as np
import matplotlib.pyplot as plt
Moisture_mean_x = pd.read_excel("signal.xlsx", header = None)
Moisture_mean_x = pd.rolling_mean(Moisture_mean_x, 10) # doesn't help
Moisture_mean_x = Moisture_mean_x.dropna()
Moisture_mean_x = Moisture_mean_x -Moisture_mean_x.mean()
frate = 100. #/sec
Hn = fft.fft(Moisture_mean_x)
freqs = fft.fftfreq(len(Hn), 1/frate)
idx = np.argmax(np.abs(Hn))
freq_in_hertz = freqs[idx]
Can someone guide me how to fix this?
You are right, there is something wrong. One needs to explicitly ask pandas for the zeroth column:
Hn = np.fft.fft(Moisture_mean_x[0])
Otherwise something goes wrong, which you can see from the fact that the FFT result is not symmetric, which should be the case for real input.
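As a quick sanity check, a small sketch (assuming the Moisture_mean_x DataFrame from the question) that verifies the conjugate symmetry a real-valued input should produce:
import numpy as np

Hn = np.fft.fft(Moisture_mean_x[0])

# For real input the spectrum is conjugate-symmetric: Hn[k] == conj(Hn[N-k])
assert np.allclose(Hn[1:], np.conj(Hn[1:][::-1]))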
Seems like #tillsten already answered your question, but here is some additional confirmation. The first plot is your data (zero mean and I changed it to a csv). The second is the power spectral density and you can see a fat mass with a peak at ~0.3 Hz. I 'zoomed' in on the third plot to see if there was a second hidden frequency close to the main frequency.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
x = pd.read_csv("signal.csv")
x = np.array(x, dtype=float)[:,0]
x = x - np.mean(x)
fs = 1e2
f, Pxx = signal.welch(x, fs, nperseg=1024)
f_res, Pxx_res = signal.welch(x, fs, nperseg=2048)
plt.subplot(3,1,1)
plt.plot(x)
plt.subplot(3,1,2)
plt.plot(f, Pxx)
plt.xlim([0, 1])
plt.xlabel('frequency [Hz]')
plt.ylabel('PSD')
plt.subplot(3,1,3)
plt.plot(f_res, Pxx_res)
plt.xlim([0, 1])
plt.xlabel('frequency [Hz]')
plt.ylabel('PSD')
plt.show()
Hn = np.fft.fft(x)
freqs = np.fft.fftfreq(len(Hn), 1/fs)
idx = np.argmax(np.abs(Hn))
freq_in_hertz = freqs[idx]
print('Main freq:', freq_in_hertz)
print('RMS amp:', np.sqrt(Pxx.max()))
This prints:
Main freq: 0.32012805122
RMS amp: 0.0556044913489
An FFT is a filter bank. Just look for the magnitude peak only within the expected frequency range in the FFT result (instead of the entire result vector), and most of the other spectrum will essentially be filtered out.
It isn't necessary to filter the signal beforehand, because the FFT is a filter. Just skip those parts of the FFT that correspond to frequencies you know to contain a lot of noise - zero them out, or otherwise exclude them.
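A sketch of that idea, reusing the x and fs from the answer above and assuming the peak is expected somewhere between 0.1 Hz and 1 Hz:
import numpy as np

Hn = np.fft.fft(x)
freqs = np.fft.fftfreq(len(x), 1/fs)

band = (freqs > 0.1) & (freqs < 1.0)        # only search the expected band
idx = np.argmax(np.abs(Hn[band]))
print('Peak within band:', freqs[band][idx])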
I hope this may help you.
https://scipy-cookbook.readthedocs.io/items/ButterworthBandpass.html
You should filter only the band around the expected frequency and improve the signal-to-noise ratio before applying the FFT.
Edit:
Mark Ransom gave a smarter answer: if you have to do the FFT anyway, you can just cut off the noise after the transformation. It won't give a worse result than a filter would.
You should use a low-pass filter, which should keep the larger periodic variations and smooth out some of the higher-frequency content first. After that, you can do the FFT to get the peaks. A sketch of the kind of FIR filter typically used for this sort of thing follows below.
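A minimal sketch of such an FIR low-pass filter, assuming a signal x sampled at 100 Hz and a cutoff of 2 Hz (well above the expected ~0.3 Hz); the number of taps is an arbitrary choice and the signal is assumed to be much longer than the filter:
import numpy as np
from scipy import signal

fs = 100.0                                     # sampling rate in Hz
taps = signal.firwin(numtaps=101, cutoff=2.0, fs=fs)
x_filtered = signal.filtfilt(taps, [1.0], x)   # zero-phase FIR low-pass filtering

Hn = np.fft.fft(x_filtered)
freqs = np.fft.fftfreq(len(x_filtered), 1/fs)
pos = (freqs > 0)                              # skip the DC bin and negative frequencies
print('Peak:', freqs[pos][np.argmax(np.abs(Hn[pos]))])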

scipy/numpy FFT on data from file

I looked into many examples of scipy.fft and numpy.fft. Specifically this example Scipy/Numpy FFT Frequency Analysis is very similar to what I want to do. Therefore, I used the same subplot positioning and everything looks very similar.
I want to import data from a file, which contains just one column to make my first test as easy as possible.
My code looks like this:
import numpy as np
import scipy as sy
import scipy.fftpack as syfp
import pylab as pyl
# Read in data from file here
array = np.loadtxt("data.csv")
length = len(array)
# Create time data for x axis based on array length
x = sy.linspace(0.00001, length*0.00001, num=length)
# Do FFT analysis of array
FFT = sy.fft(array)
# Getting the related frequencies
freqs = syfp.fftfreq(array.size, d=(x[1]-x[0]))
# Create subplot windows and show plot
pyl.subplot(211)
pyl.plot(x, array)
pyl.subplot(212)
pyl.plot(freqs, sy.log10(FFT), 'x')
pyl.show()
The problem is that I will always get my peak at exactly zero, which should not be the case at all. It really should appear at around 200 Hz.
With a smaller range:
Still the biggest peak is at zero.
As already mentioned, it seems like your signal has a DC component, which will cause a peak at f=0. Try removing the mean with, e.g., arr2 = array - np.mean(array).
Furthermore, for analyzing signals, you might want to try plotting the power spectral density:
import matplotlib.pylab as plt
import matplotlib.mlab as mlb
Fs = 1./(x[1] - x[0]) # sampling frequency, from the time axis x in the question
plt.psd(array, Fs=Fs, detrend=mlb.detrend_mean)
plt.show()
Take a look at the documentation of plt.psd(), since there are quite a lot of options to fiddle with. For investigating the change of the spectrum over time, plt.specgram() comes in handy.
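For reference, a minimal sketch of the mean-removal fix applied to the question's own FFT code (the names array and x are taken from the question):
import numpy as np
import scipy.fftpack as syfp

arr2 = array - np.mean(array)              # remove the DC component
FFT = np.fft.fft(arr2)
freqs = syfp.fftfreq(arr2.size, d=(x[1] - x[0]))

pos = (freqs > 0)                          # positive frequencies only
print('Peak:', freqs[pos][np.argmax(np.abs(FFT[pos]))])   # should now sit near 200 Hz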

How to plot the integral of a signal as time goes by?

I have a time-dependent signal.
I wish to plot its integral over time, with time on the x-axis and the integral value on the y-axis.
Is there any Python way of doing this?
To be more specific:
I have a time array, time, and a signal array, signal. They are of same dimension.
I need to integrate signal over time with scipy.integrate.trapz().
Instead of getting the final integral, I wish to see the integral varying as time passes.
Try using scipy.integrate.cumtrapz() instead:
plt.plot(time[:-1], scipy.integrate.cumtrapz(signal, x=time))
plt.show()
It computes an array containing the cumulative integral values.
http://docs.scipy.org/doc/scipy-0.10.1/reference/generated/scipy.integrate.trapz.html
A slightly better answer uses the optional "initial" argument. Here is a complete example:
import scipy.integrate as it
import numpy as np
import matplotlib.pyplot as plt
t=np.linspace(0,1, 100)
y=t**2
y_int = it.cumtrapz( y , t, initial=0.0) # y_int is same size as t
plt.plot(t, y_int)
plt.show()
This avoids the strange indexing like time[:-1].
