I am trying to plot the FFT of a WAV file. I have done this successfully with the regular fft, but I wanted to experiment with rfft since my application is music. When I try to plot xf against yf (figure 2), I run into an issue where xf is half the length of yf, and I can't figure out why. I assume it's due to the missing negative frequencies, but I thought changing both function calls to rfft and rfftfreq would handle that.
import numpy as np
import soundfile as sf
import matplotlib.pyplot as plt
square = 'square.wav'
sine = 'sine.wav'
k = '1Khz.wav'
cello = 'cello.wav'
data, fs = sf.read(k)
#Plot the Signal
N = len(data)
T = 1.0/fs
x = np.linspace(0, (N*T), N)
plt.plot(x, data)
count = 0
yf = np.fft.rfft(data)
xf = np.fft.rfftfreq(yf.size, d=T)
plt.plot(xf, yf)
The sizes used for numpy.fft.rfft and numpy.fft.rfftfreq need to match. As such, you should pass data.size rather than yf.size as the argument to rfftfreq (yf is already shorter because it does not include the negative frequencies):
yf = np.fft.rfft(data)
xf = np.fft.rfftfreq(data.size, d=T)
Finally note that as you plot yf with plt.plot(xf, yf) you would get a warning about the imaginary part being lost. If you are interested in plotting the magnitude of the frequency spectrum, you should rather use plt.plot(xf, abs(yf)).
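As a quick check of the size rule above, here is a minimal sketch that uses a synthetic 1 kHz tone instead of the WAV file. The 44100 Hz sample rate and the one-second duration are assumptions made for illustration; they are not taken from the question.

```python
import numpy as np

fs = 44100                            # assumed sample rate in Hz (hypothetical)
T = 1.0 / fs
t = np.arange(fs) * T                 # one second of samples
data = np.sin(2 * np.pi * 1000 * t)   # synthetic stand-in for '1Khz.wav'

yf = np.fft.rfft(data)

# Passing yf.size halves the length a second time, so the axes no longer match
xf_wrong = np.fft.rfftfreq(yf.size, d=T)
# Passing the length of the time-domain signal gives one frequency per rfft bin
xf = np.fft.rfftfreq(data.size, d=T)

print(yf.size, xf_wrong.size, xf.size)

# Frequency (in Hz) of the largest magnitude bin
peak_hz = xf[np.argmax(np.abs(yf))]
print(peak_hz)
```

With matching sizes, plt.plot(xf, np.abs(yf)) puts the peak of this test tone at 1 kHz.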
You need to scale the frequencies by the sample rate, which you do by passing the sample spacing d = 1/sample_rate to rfftfreq. See https://stackoverflow.com/a/27191172/7919597 or the documentation of rfftfreq:
signal = np.array([-2, 8, 6, 4, 1, 0, 3, 5, -3, 4], dtype=float)
fourier = np.fft.rfft(signal)
n = signal.size
sample_rate = 100
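# Two-sided frequency axis (includes the negative frequencies), shown for comparison
freq_full = np.fft.fftfreq(n, d=1./sample_rate)
# One-sided frequency axis that matches the output of rfft
freq = np.fft.rfftfreq(n, d=1./sample_rate)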
I've been trying to get the Fourier transform of the data in the file S1L1E here: https://github.com/gergobes/data-for-fft/blob/c0b664379c0eeab04abdc541e34d6e636e841eb0/S1L1E. The first column is time, the second column is the amplitude of the first wave, and the third column is the amplitude of another wave. The code I tried is:
import matplotlib.pyplot as plt
import numpy as np
from scipy.fft import fft, fftfreq
data = np.loadtxt("S1L1E")
time = data[:,0]
amp_left = data[:,1]
amp_right = data[:,2]
plt.plot(time, amp_left)
plt.plot(time, amp_right)
# fft attempt
samplerate = 5000
duration = time[-1]
N = int(samplerate * duration)
x = time
y = amp_left
yf = fft(y)
xf = fftfreq(len(y))
plt.plot(xf, abs(yf))
I tried for the first wave but got only a spike at 0. What am I doing wrong? It's my first time trying a fft so I'm kinda lost here. I would appreciate any help.
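One common cause of a lone spike at 0, which may or may not apply to this file, is a large constant offset in the signal: the zero-frequency bin then dwarfs everything else. Separately, fftfreq returns cycles per sample unless it is given the sample spacing d. The sketch below illustrates both points with a hypothetical signal (a 50 Hz sine with an offset of 10) that stands in for the S1L1E data; numpy.fft is used instead of scipy.fft only to keep the example dependency-free.

```python
import numpy as np

samplerate = 5000                          # value used in the question
time = np.arange(samplerate) / samplerate  # one second of hypothetical samples
y = 10 + np.sin(2 * np.pi * 50 * time)     # hypothetical signal with a DC offset

dt = time[1] - time[0]                     # sample spacing from the time column
xf = np.fft.rfftfreq(len(y), d=dt)         # frequency axis in Hz

# With the offset left in, the largest bin is the one at 0 Hz
peak_with_offset = xf[np.argmax(np.abs(np.fft.rfft(y)))]

# Subtracting the mean removes the 0 Hz component and reveals the tone
yf = np.fft.rfft(y - y.mean())
peak_without_offset = xf[np.argmax(np.abs(yf))]

print(peak_with_offset, peak_without_offset)
```

If the real data behaves the same way, plotting abs(yf) of the mean-removed signal against xf should show the actual spectral content.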
This is not a duplicate question since other answers only explain how to plot the cross-correlation function and do not explain how you can get the time difference.
Given a sin signal and shifted version, we should be able to get the time delay between them.
I have created a sin signal and shifted it by t_d=0.05. The following is my code and its output:
import numpy as np
import matplotlib.pyplot as plt
fs = 1000
x = np.linspace(0, 1, fs)
f = 5
t_shift = 0.05
y = np.sin(2*np.pi*f*x)
y_shifted = np.sin(2*np.pi*f*(x-t_shift))
fig, ax = plt.subplots()
ax.plot(x, y, x, y_shifted)
By normalizing signals and applying numpy.correlate we get the following:
y_norm = (y-y.mean())/y.std()
y_shifted_norm = (y_shifted - y_shifted.mean())/y_shifted.std()
cc = np.correlate(y_norm, y_shifted_norm, 'full')
fig, ax = plt.subplots()
ax.plot(range(len(cc)), cc)
From the indices of cross-correlation function, how can I get t_shift=0.05?
@Sepide, it seems to me that you are trying to maximise the correlation between the signal y and a shifted copy of y_shifted. This might be possible with np.correlate(), but recovering the time shift from its output does seem nontrivial. In the solution below I shift the time series manually and compute the correlation coefficient with np.corrcoef. The shift at which this Pearson correlation coefficient is largest (close to 1) is the one that aligns the two signals.
import numpy as np
import matplotlib.pyplot as plt
# Setting
fs = 1000
x = np.linspace(0, 1, fs)
f = 5
t_shift = 0.05
t_step = 1/fs
# Data
y = np.sin(2*np.pi*f*x)
y_shifted = np.sin(2*np.pi*f*(x-t_shift))
# Compute correlation
MaxTimeShift = 200
CorrelationList = np.empty((MaxTimeShift, 1))
CorrelationList[:] = np.nan  # np.NaN is not available in NumPy 2.0 and later
# Compute correlation for various shifts
for shift in range(MaxTimeShift):
    CorrelationList[shift] = np.corrcoef(y[0:801].T, y_shifted[shift:(801+shift)].T)[0,1]
# Plot 1
plt.plot(x, y, x, y_shifted)
# Plot 2
ShiftList = t_step*np.arange(MaxTimeShift)
plt.plot(ShiftList, CorrelationList)
plt.title("Correlation coefficient")
print("The time shift between the signals is: ", ShiftList[np.argmax(CorrelationList)])
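The lag can also be read from the output of np.correlate directly. In 'full' mode, output index i corresponds to a lag of i - (len(v) - 1) samples, where v is the second argument, so the time shift is the lag at the peak multiplied by the sample spacing. The sketch below uses a hypothetical helper, estimate_delay, and a random test signal delayed by 50 samples; both are illustrative assumptions. A broadband signal is used because, for a pure sine, the peak repeats every period and the zero-padding of 'full' mode can pull it slightly away from the true lag.

```python
import numpy as np

def estimate_delay(reference, delayed, dt):
    """Return how far `delayed` lags behind `reference`, in units of dt."""
    reference = reference - reference.mean()
    delayed = delayed - delayed.mean()
    # c[k] = sum_n delayed[n + k] * reference[n], largest where the two line up
    cc = np.correlate(delayed, reference, mode="full")
    # Lag (in samples) that belongs to each index of the 'full' output
    lags = np.arange(-(len(reference) - 1), len(delayed))
    return lags[np.argmax(cc)] * dt

fs = 1000
dt = 1 / fs
shift_samples = 50                      # 50 samples * 1 ms = 0.05 s

rng = np.random.default_rng(0)
y = rng.standard_normal(fs)             # hypothetical broadband test signal
# Delay y by shift_samples, padding the start with zeros
y_shifted = np.concatenate([np.zeros(shift_samples), y[:-shift_samples]])

t_shift_estimate = estimate_delay(y, y_shifted, dt)
print(t_shift_estimate)
```

A positive result means the second signal is delayed relative to the first; swapping the arguments flips the sign.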
I want to plot a power spectrum from my data set (array of about 2000 values, the data is recorded every minute).
I've gotten so far as:
y= np.fft.fft(data)
abs = np.abs(y) #absolute value
p = np.square(abs) #power
but am confused about setting the frequency.
I've tried using freqs = np.fft.fftfreq(len(y)), but the resulting plot can't be right.
What am I doing wrong?
Here is an example to plot the power spectrum:
import matplotlib.pyplot as plt
import numpy as np
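# Sample for 1 s at 1000 Hz so that both tones lie below the Nyquist frequency (500 Hz);
# with a 1 s record, the bin index equals the frequency in Hz for the first half of the bins
t = np.linspace(0, 1, 1000, endpoint=False)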
data = 2 * np.sin(2*np.pi *60*t) + 2 * np.sin(2*np.pi *42*t)
spectrum = np.fft.fft(data)
power_spectrum = np.square(np.abs(spectrum))
fig, ax = plt.subplots()
ax.plot(np.arange(len(power_spectrum)), power_spectrum)
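To put real units on the x axis, the frequency helper needs the sample spacing d. For data recorded once per minute, d=60 gives frequencies in Hz (d=1 would give cycles per minute). A minimal sketch, assuming 2000 samples and a hypothetical 50-minute cycle standing in for the real data:

```python
import numpy as np

n = 2000                         # about the size mentioned in the question
dt = 60.0                        # one sample per minute, expressed in seconds
t = np.arange(n) * dt
data = 3.0 + np.sin(2 * np.pi * t / (50 * 60))   # hypothetical 50-minute cycle

# One-sided spectrum of the mean-removed data (the input is real)
spectrum = np.fft.rfft(data - data.mean())
power = np.abs(spectrum) ** 2
freqs = np.fft.rfftfreq(n, d=dt)                 # in Hz because dt is in seconds

peak_freq = freqs[np.argmax(power)]
peak_period_minutes = 1.0 / peak_freq / 60.0
print(peak_freq, peak_period_minutes)
```

Plotting power against freqs then shows the peak at the frequency of the cycle rather than at an arbitrary bin index.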
I am trying to do an FFT on a .wav file that contains only a 1 kHz sin wave. When I plot the result, I expect the peak to be at the fundamental (1 kHz) but instead, I see the peak at what seems to be the 3rd harmonic (3 kHz). I have tried 2 other .wav files at 440 Hz and 2 kHz with the same result. I used a frequency counter to verify the .wav files contain the frequencies I expect.
For comparison, I use the commented code below to generate and plot a sin function which displays correctly.
import matplotlib.pyplot as plt
import numpy as np
import wave, struct
from numpy.fft import fft  # FFT routine used below
sound_file = wave.open(r'c:\downloads\SineWave_1000Hz.wav', 'r')  # raw string keeps the backslashes literal
file_length = sound_file.getnframes()
data = sound_file.readframes(file_length)
data = struct.unpack('{n}h'.format(n=file_length), data)
data = np.array(data)
#x = np.linspace(0.0, 1, 600)
#y = np.sin(50.0 * 2.0*np.pi*x)
#yf = fft(y)
yf = fft(data)
plt.xlim(0, 4000)
plt.plot( np.abs(yf))
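One possible explanation (an assumption, since the post gives neither the duration nor the sample rate of the file): plt.plot(np.abs(yf)) puts the bin index on the x axis, not the frequency in Hz. Bin k corresponds to k * fs / N Hz, so for a three-second file a 1 kHz tone lands at bin 3000. The sketch below uses a synthetic tone with an assumed 44100 Hz sample rate in place of the WAV file:

```python
import numpy as np

fs = 44100                       # assumed sample rate (hypothetical)
duration = 3                     # assumed length in seconds (hypothetical)
n = fs * duration
t = np.arange(n) / fs
data = np.sin(2 * np.pi * 1000 * t)      # synthetic 1 kHz tone

yf = np.fft.rfft(data)
peak_bin = int(np.argmax(np.abs(yf)))    # what a plot against bin index shows

freqs = np.fft.rfftfreq(n, d=1 / fs)     # bin index converted to Hz
peak_hz = freqs[peak_bin]
print(peak_bin, peak_hz)
```

With the real file, the sample rate would come from sound_file.getframerate(), and plotting against the frequency axis instead of the bin index should move the peak to the expected frequency if this is indeed the cause.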
Do you know how to remove this much noise from the FFT?
Here is my FFT code:
import numpy as np
import matplotlib.pyplot as plt
# Bx and By are the measured data arrays (defined elsewhere)
fft1 = Bx[51:-14]
fft2 = By[1:-14]
# Loop for FFT data
for dataset in [fft1]:
    dataset = np.asarray(dataset)
    psd = np.abs(np.fft.fft(dataset))**2
    freq = np.fft.fftfreq(dataset.size, float(300)/dataset.size)
    plt.semilogy(freq[freq>0], psd[freq>0]/dataset.size**2, color='r')
for dataset2 in [fft2]:
    dataset2 = np.asarray(dataset2)
    psd2 = np.abs(np.fft.fft(dataset2))**2
    freq2 = np.fft.fftfreq(dataset2.size, float(300)/dataset2.size)
    plt.semilogy(freq2[freq2>0], psd2[freq2>0]/dataset2.size**2, color='b')
What I get:
What I need:
Any ideas? Welch does not work for me: as you can see, I don't want to smooth my chart, but to reduce the noise to the level shown in the second picture.
This is what Welch does:
and a bit of code:
freqs, psd = scipy.signal.welch(dataset, fs=300, window='hamming')
Updated Welch:
A bit of code:
from scipy.signal import welch
# Loop for FFT data
for dataset in [fft1]:
    dataset = np.asarray(dataset)
    freqs, psd = welch(dataset, fs=266336/300, window='hamming', nperseg=512)
    plt.semilogy(freqs, psd/dataset.size**2, color='r')
for dataset2 in [fft2]:
    dataset2 = np.asarray(dataset2)
    freqs2, psd2 = welch(dataset2, fs=266336/300, window='hamming', nperseg=512)
    plt.semilogy(freqs2, psd2/dataset2.size**2, color='b')
As you can see, Welch is configured well: it shows the 60 Hz power line and its harmonics. It is almost right, but it has smoothed my plot completely. See the second graph, which is the desired result. By the way, the y scale on the Welch plot is wrong, but that is just a matter of raising the data to the power of two.
I have changed to nperseg=8192 and it worked. Look at the results.
Here is an example that shows how to use nperseg to control the frequency resolution vs. noise reduction tradeoff:
Setting nperseg to the length of the signal is more or less equivalent to using the FFT without any averaging.
Here is the code to generate this image:
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
plt.figure(figsize=[8, 12])
n = 2**21
fs = 887
# example data
x = np.random.randn(n)
x += np.sin(np.cumsum(0.42 + np.random.randn(n) * 0.01)) * 5
x = signal.lfilter([1, 0.5], 2, x)
plt.subplot(3, 2, 1)
plt.semilogy(np.abs(np.fft.fft(x)[:n//2])**2 / n**2, label='FFT')
for i, nperseg in enumerate([128, 512, 8192, 65536, n]):
    plt.subplot(3, 2, i+2)
    f, psd = signal.welch(x, fs=fs, window='hamming', nperseg=nperseg, noverlap=0)
    plt.semilogy(f, psd, label='nperseg={}'.format(nperseg))
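The trade-off that nperseg controls can be put in numbers. With noverlap=0, each segment gives a bin spacing of fs / nperseg, and the number of segments that get averaged is n // nperseg. A minimal sketch using the values from the code above (the tuple layout and variable names are just for illustration):

```python
n = 2**21
fs = 887

rows = []
for nperseg in [128, 512, 8192, 65536, n]:
    resolution_hz = fs / nperseg     # spacing between frequency bins
    n_segments = n // nperseg        # periodograms averaged (noverlap=0)
    rows.append((nperseg, resolution_hz, n_segments))
    print(f"nperseg={nperseg:>8}  resolution={resolution_hz:.6f} Hz  segments={n_segments}")
```

More segments mean less variance in the estimate but coarser frequency resolution; with nperseg equal to n there is a single segment and nothing is averaged.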