How to plot frequency data from a .wav file in Python? - python

I'm trying to plot the frequencies that make up the first 1 second of a voice recording.
My approach was to:
Read the .wav file as a numpy array containing time series data
Slice the array from [0:sample_rate-1], given that the sample rate has units of [samples/1 second], which implies that sample_rate [samples/seconds] * 1 [seconds] = sample_rate [samples]
Perform a fast fourier transform (fft) on the time series array in order to get the frequencies that make up that time-series sample.
Plot the the frequencies on the x-axis, and amplitude on the y-axis. The frequency domain would range from 0:(sample_rate/2) since the Nyquist Sampling Theorem tells us that the recording captured frequencies of at least two times the maximum frequency, i.e 2*max(frequency). I'll also slice the frequency output array in half since the output frequency data is symmetrical
Here is my implementation
import matplotlib.pyplot as plt
import numpy as np
from scipy.fftpack import fft
from scipy.io import wavfile
sample_rate, audio_time_series = wavfile.read(audio_path)
single_sample_data = audio_time_series[:sample_rate]
def fft_plot(audio, sample_rate):
N = len(audio) # Number of samples
T = 1/sample_rate # Period
y_freq = fft(audio)
domain = len(y_freq) // 2
x_freq = np.linspace(0, sample_rate//2, N//2)
plt.plot(x_freq, abs(y_freq[:domain]))
plt.xlabel("Frequency [Hz]")
plt.ylabel("Frequency Amplitude |X(t)|")
return plt.show()
fft_plot(single_sample_data, sample_rate)
This is the plot that it generated
However, this is incorrect, my spectrogram tells me I should have frequency peaks below the 5kHz range:
In fact, what this plot is actually showing, is the first second of my time series data:
Which I was able to debug by removing the absolute value function from y_freq when I plot it, and entering the entire audio signal into my fft_plot function:
...
sample_rate, audio_time_series = wavfile.read(audio_path)
single_sample_data = audio_time_series[:sample_rate]
def fft_plot(audio, sample_rate):
N = len(audio) # Number of samples
y_freq = fft(audio)
domain = len(y_freq) // 2
x_freq = np.linspace(0, sample_rate//2, N//2)
# Changed from abs(y_freq[:domain]) -> y_freq[:domain]
plt.plot(x_freq, y_freq[:domain])
plt.xlabel("Frequency [Hz]")
plt.ylabel("Frequency Amplitude |X(t)|")
return plt.show()
# Changed from single_sample_data -> audio_time_series
fft_plot(audio_time_series, sample_rate)
The code sample above produced, this plot:
Therefore, I think one of two things is going on:
The fft() function is not actually performing an fft on the time series data it is being given
The .wav file does not contain time series data to begin with
What could be the issue? Has anyone else experienced this?

I have replicated, essentially replicated, the code in the question and I don't see the problem the OP has described.
In [172]: %reset -f
...: import matplotlib.pyplot as plt
...: import numpy as np
...: from scipy.fftpack import fft
...: from scipy.io import wavfile
...:
...: sr, data = wavfile.read('sample.wav')
...: print(data.shape, sr)
...: signal = data[:sr,0]
...: Signal = fft(signal)
...: fig, (axt, axf) = plt.subplots(2, 1,
...: constrained_layout=1,
...: figsize=(11.8,3))
...: axt.plot(signal, lw=0.15) ; axt.grid(1)
...: axf.plot(np.abs(Signal[:sr//2]), lw=0.15) ; axf.grid(1)
...: plt.show()
sr, data = wavfile.read('sample.wav')
(268237, 2) 8000
Hence, I'm voting for closing the question because it is "Not reproducible or was caused by a typo".

Related

How to Extract the following Frequency-domain Features in Python?

Please feel free to point out any errors/improvements in the existing code
So this is a very basic question and I only have a beginner level understanding of signal processing. I have a 1.02 second accelerometer data sampled at 32000 Hz. I am looking to extract the following frequency domain features after having performed FFT in python -
Mean Freq, Median Freq, Power Spectrum Deformation, Spectrum energy, Spectral Kurtosis, Spectral Skewness, Spectral Entropy, RMSF (Root Mean Square Freq.), RVF (Root Variance Frequency), Power Cepstrum.
More specifically, I am looking for plots of these features as a final output.
The csv file containing data has four columns: Time, X Axis Value, Y Axis Value, Z Axis Value (The accelerometer is a triaxial one). So far on python, I have been able to visualize the time domain data, apply convolution filter to it, applied FFT and generated a Spectogram that shows an interesting shock
To Visualize Data
#Importing pandas and plotting modules
import numpy as np
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
#Reading Data
data = pd.read_csv('HelicalStage_Aug1.csv', index_col=0)
data = data[['X Value', 'Y Value', 'Z Value']]
date_rng = pd.date_range(start='1/8/2018', end='11/20/2018', freq='s')
#Plot the entire time series data and show gridlines
data.grid=True
data.plot()
enter image description here
Denoising
# Applying Convolution Filter
mylist = [1, 2, 3, 4, 5, 6, 7]
N = 3
cumsum, moving_aves = [0], []
for i, x in enumerate(mylist, 1):
cumsum.append(cumsum[i-1] + x)
if i>=N:
moving_ave = (cumsum[i] - cumsum[i-N])/N
#can do stuff with moving_ave here
moving_aves.append(moving_ave)
np.convolve(x, np.ones((N,))/N, mode='valid')
result_X = np.convolve(data[["X Value"]].values[:,0], np.ones((20001,))/20001, mode='valid')
result_Y = np.convolve(data[["Y Value"]].values[:,0], np.ones((20001,))/20001, mode='valid')
result_Z = np.convolve(data[["Z Value"]].values[:,0],
np.ones((20001,))/20001, mode='valid')
plt.plot(result_X-np.mean(result_X))
plt.plot(result_Y-np.mean(result_Y))
plt.plot(result_Z-np.mean(result_Z))
enter image description here
FFT and Spectogram
import numpy as np
import scipy as sp
import scipy.fftpack
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
df = pd.read_csv('HelicalStage_Aug1.csv')
df = df.drop(columns="Time")
df.plot()
plt.title('Sensor Data as Time Series')
signal = df[['Y Value']]
signal = np.squeeze(signal)
Y = np.fft.fftshift(np.abs(np.fft.fft(signal)))
Y = Y[int(len(Y)/2):]
Y = Y[10:]
plt.figure()
plt.plot(Y)
enter image description here
plt.figure()
powerSpectrum, freqenciesFound, time, imageAxis = plt.specgram(signal, Fs= 32000)
plt.show()
enter image description here
If my code is correct and the generated FFT and spectrogram are good, then how can I graphically compute the previously mentioned frequency domain features?
I have tried doing the following for MFCC -
import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
from scipy.io import wavfile
from python_speech_features import mfcc
from python_speech_features import logfbank
# Extract MFCC and Filter bank features
mfcc_features = mfcc(signal, Fs)
filterbank_features = logfbank(signal, Fs)
# Printing parameters to see how many windows were generated
print('\nMFCC:\nNumber of windows =', mfcc_features.shape[0])
print('Length of each feature =', mfcc_features.shape[1])
print('\nFilter bank:\nNumber of windows =', filterbank_features.shape[0])
print('Length of each feature =', filterbank_features.shape[1])
Visualizing filter bank features
#Matrix needs to be transformed in order to have horizontal time domain
mfcc_features = mfcc_features.T
plt.matshow(mfcc_features)
plt.title('MFCC')
enter image description here
enter image description here
I think your fft taking procedure is not correct, fft output is usually peak and when you are taking abs it should be one peak, as , probably you should change it to Y = np.fft.fftshift(np.abs(np.fft.fft(signal))) to Y=np.abs(np.fft.fftshift(signal)

Frequency domain of a sine wave with frequency 1000Hz

I'm starting DSP on Python and I'm having some difficulties:
I'm trying to define a sine wave with frequency 1000Hz
I try to do the FFT and find its frequency with the following piece of code:
import numpy as np
import matplotlib.pyplot as plt
sampling_rate = int(10e3)
n = int(10e3)
sine_wave = [100*np.sin(2 * np.pi * 1000 * x/sampling_rate) for x in range(0, n)]
s = np.array(sine_wave)
print(s)
plt.plot(s[:200])
plt.show()
s_fft = np.fft.fft(s)
frequencies = np.abs(s_fft)
plt.plot(frequencies)
plt.show()
So first plot makes sense to me.
Second plot (FFT) shows two frequencies:
i) 1000Hz, which is the one I set at the beggining
ii) 9000Hz, unexpectedly
freqeuncy domain
Your data do not respect Shannon criterion. you do not set a correct frequencies axis.
It's easier also to use rfft rather than fft when the signal is real.
Your code can be adapted like :
import numpy as np
import matplotlib.pyplot as plt
sampling_rate = 10000
n = 10000
signal_freq = 4000 # must be < sampling_rate/2
amplitude = 100
t=np.arange(0,n/sampling_rate,1/sampling_rate)
sine_wave = amplitude*np.sin(2 * np.pi *signal_freq*t)
plt.subplot(211)
plt.plot(t[:30],sine_wave[:30],'ro')
spectrum = 2/n*np.abs(np.fft.rfft(sine_wave))
frequencies = np.fft.rfftfreq(n,1/sampling_rate)
plt.subplot(212)
plt.plot(frequencies,spectrum)
plt.show()
Output :
There is no information loss, even if a human eye can be troubled by the temporal representation.

NumPy Fast Fourier transform (FFT) does not work on sine wave generated in Audacity

I am trying to use the NumPy library for Python to do some frequency analysis. I have two .wav files that both contain a 440 Hz sine wave. One of them I generated with the NumPy sine function, and the other I generated in Audacity. The FFT works on the Python-generated one, but does nothing on the Audacity one.
Here are links to the two files:
The non-working file: 440_audacity.wav
The working file: 440_gen.wav
This is the code I am using to do the Fourier transform:
import numpy as np
import matplotlib.pyplot as plt
import scipy.io.wavfile as wave
infile = "440_gen.wav"
rate, data = wave.read(infile)
data = np.array(data)
data_fft = np.fft.fft(data)
frequencies = np.abs(data_fft)
plt.subplot(2,1,1)
plt.plot(data[:800])
plt.title("Original wave: " + infile)
plt.subplot(2,1,2)
plt.plot(frequencies)
plt.title("Fourier transform results")
plt.xlim(0, 1000)
plt.tight_layout()
plt.show()
I have two 16-bit PCM .wav files, one from Audacity and one created with the NumPy sine function. The NumPy-generated one gives the following (correct) result, with the spike at 440Hz:
The one I created with Audacity, although the waveform appears identical, does not give any result on the Fourier transform:
I admit I am at a loss here. The two files should contain in effect the same data. They are encoded the same way, and the wave forms appear identical on the upper graph.
Here is the code used to generate the working file:
import numpy as np
import wave
import struct
import matplotlib.pyplot as plt
from operator import add
freq_one = 440.0
num_samples = 44100
sample_rate = 44100.0
amplitude = 12800
file = "440_gen.wav"
s1 = [np.sin(2 * np.pi * freq_one * x/sample_rate) * amplitude for x in range(num_samples)]
sine_one = np.array(s1)
nframes = num_samples
comptype = "NONE"
compname="not compressed"
nchannels = 1
sampwidth = 2
wav_file = wave.open(file, 'w')
wav_file.setparams((nchannels, sampwidth, int(sample_rate), nframes, comptype, compname))
for s in sine_one:
wav_file.writeframes(struct.pack('h', int(s)))
Let me explain why your code doesn't work. And why it works with [:44100].
First of all, you have different files:
440_gen.wav = 1 sec and 44100 samples (counts)
440_audacity.wav = 5 sec and 220500 samples (counts)
Since for 440_gen.wav in FFT you use the number of reference points N=44100 and the sample rate 44100, your frequency resolution is 1 Hz (bins are followed in 1 Hz increments).
Therefore, on the graph, each FFT sample corresponds to a delta equal to 1 Hz.
plt.xlim(0, 1000) just corresponds to the range 0-1000 Hz.
However, for 440_audacity.wav in FFT, you use the number of reference points N=220500 and the sample rate 44100. Your frequency resolution is 0.2 Hz (bins follow in 0.2 Hz increments) - on the graph, each FFT sample corresponds to a frequency in 0.2 Hz increments (min-max = +(-) 22500 Hz).
plt.xlim(0, 1000) just corresponds to the range 1000x0.2 = 0-200 Hz.
That is why the result is not visible - it does not fall within this range.
plt.xlim (0, 5000) will correct your situation and extend the range to 0-1000 Hz.
The solution [:44100] that jwalton brought in really only forces the FFT to use N = 44100. And this repeats the situation with the calculation for 440_gen.wav
A more correct solution to your problem is to use the N (Windows Size) parameter in the code and the np.fft.fftfreq() function.
Sample code below.
I also recommend an excellent article https://realpython.com/python-scipy-fft/
import numpy as np
import matplotlib.pyplot as plt
import scipy.io.wavfile as wave
N = 44100 # added
infile = "440_audacity.wav"
rate, data = wave.read(infile)
data = np.array(data)
data_fft = np.fft.fft(data, N) # added N
frequencies = np.abs(data_fft)
x_freq = np.fft.fftfreq(N, 1/44100) # added
plt.subplot(2,1,1)
plt.plot(data[:800])
plt.title("Original wave: " + infile)
plt.subplot(2,1,2)
plt.plot(x_freq, frequencies) # added x_freq
plt.title("Fourier transform results")
plt.xlim(0, 1000)
plt.tight_layout()
plt.show()
Since answering this question #Konyukh Fyodorov was able to provide a better and properly justified solution (below).
The following worked for me and produced the plots as expected. Unfortunately I cannot piece together quite why this works, but I'm sharing this solution in the hope it may assist someone else to make that leap.
import numpy as np
import matplotlib.pyplot as plt
import scipy.io.wavfile as wave
infile = "440_gen.wav"
rate, data = wave.read(infile)
data = np.array(data)
# Use first 44100 datapoints in transform
data_fft = np.fft.fft(data[:44100])
frequencies = np.abs(data_fft)
plt.subplot(2,1,1)
plt.plot(data[:800])
plt.title("Original wave: " + infile)
plt.subplot(2,1,2)
plt.plot(frequencies)
plt.title("Fourier transform results")
plt.xlim(0, 1000)
plt.tight_layout()
plt.show()

FFT spectrogram in python

I have python 3.4.
I transmitted a 2MHz (for example) frequency and received the cavitation over the time (until I stopped the measurement).
I want to get a spectrogram (cavitation vs frequency) and more interesting is a spectrogram of cavitation over the time of the sub-harmonic (1MHz) frequency.
The data is saved in sdataA (=cavitation), and t (=measurement time)
I tried to save fft in FFTA
FFTA = np.array([])
FFTA = np.fft.fft(dataA)
FFTA = np.append(FFTA, dataA)
I got real and complex numbers
Then I took only half (from 0 to 1MHz) and save the real and complex data.
nA = int(len(FFTA)/2)
yAre = FFTA[range(nA)].real
yAim = FFTA[range(nA)].imag
I tried to get the frequencies by:
FFTAfreqs = np.fft.fftfreq(len(yAre))
But it is totally wrong (I printed the data by print (FFTAfreqs))
I also plotted the data and again it's wrong:
plt.plot(t, FFTA[range(n)].real, 'b-', t, FFTA[range(n)].imag, 'r--')
plt.legend(('real', 'imaginary'))
plt.show()
How can I output a spectrogram of cavitation over the time of the sub-harmonic (1MHz) frequency?
EDIT:
Data example:
see a sample of 'dataA' and 'time':
dataA = [6.08E-04,2.78E-04,3.64E-04,3.64E-04,4.37E-04,4.09E-04,4.49E-04,4.09E-04,3.52E-04,3.24E-04,3.92E-04,3.24E-04,2.67E-04,3.24E-04,2.95E-04,2.95E-04,4.94E-04,4.09E-04,3.64E-04,3.07E-04]
time = [0.00E+00,4.96E-07,9.92E-07,1.49E-06,1.98E-06,2.48E-06,2.98E-06,3.47E-06,3.97E-06,4.46E-06,4.96E-06,5.46E-06,5.95E-06,6.45E-06,6.94E-06,7.44E-06,7.94E-06,8.43E-06,8.93E-06,9.42E-06]
EDIT II:
From #Martin example I tried the following code, please let me know if I did it right.
In the case that dataA and Time are saved as h5 files (or the data that I posted already)
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dfdata = pd.read_hdf("C:\\data_python\\DataA.h5")
dft = pd.read_hdf("C:\\data_python\\time.h5")
dft_cor = int((len(dft)-2)*4.96E-6) # calculating the measured time
fs = 2000000 #sampling frequency 2MHz
CHUNK = 10000
signal_time = dft_cor # seconds
def sine(freq,fs,secs):
data=dfdata
wave = np.sin(freq*2*np.pi*data)
return wave
a1 = sine(fs,fs,120)
a2 = sine(fs/2,fs,120)
signal = a1+a2
afft = np.abs(np.fft.fft(signal[0:CHUNK]))
freqs = np.linspace(0,fs,CHUNK)[0:int(fs/2)]
spectrogram_chunk = freqs/np.amax(freqs*1.0)
# Plot spectral analysis
plt.plot(freqs[0:1000000],afft[0:1000000]) # 0-1MHz
plt.show()
number_of_chunks = 1000
# Empty spectrogram
Spectrogram = np.zeros(shape = [CHUNK,number_of_chunks])
for i in range(number_of_chunks):
afft = np.abs(np.fft.fft(signal[i*CHUNK:(1+i)*CHUNK]))
freqs = np.linspace(0,fs,CHUNK)[0:int(fs/2)]
spectrogram_chunk = afft/np.amax(afft*1.0)
try:
Spectrogram[:,i]=spectrogram_chunk
except:
break
import cv2
Spectrogram = Spectrogram[0:1000000,:]
cv2.imshow('spectrogram',np.uint8(255*Spectrogram/np.amax(Spectrogram)))
cv2.waitKey()
cv2.destroyAllWindows()
It seems your problem is not in Python but in understanding what is Spectrogram.
Spectrogram is sequences of spectral analysis of a signal.
1) You need to cut your signal in CHUNKS.
2) Do spectral analysis of these CHUNKS and stick it together.
Example:
You have 1 second of audio recoding (44100 HZ sampling). That means the recording will have 1s * 44100 -> 44100 samples. You define CHUNK size = 1024 (for example).
For each chunk you will do FFT, and stick it together into 2D matrix (X axis - FFT of the CHUNK, Y axis - CHUNK number,). 44100 samples / CHUNK ~ 44 FFTs, each of the FFT covers 1024/44100~0.023 seconds of the signal
The bigger the CHUNK, the more accurate Spectrogram is, but less 'realtime'.
The smaller the CHUNK is, the less acurate is the Spectrogram, but you have more measurements as you measure frequencies 'more often'.
If you need 1MHZ - actually you cannot use anything higher than 1MHZ, you just take half of the resulting FFT array - and it doesnt matter which half, because 1MHZ is just the half of your sampling frequency, and the FFT is mirroring anything that is higher than 1/2 of sampling frequency.
About FFT, you dont want complex numbers. You want to do
FFT = np.abs(FFT) # Edit - I just noticed you use '.real', but I will keep it here
because you want real numbers.
Preparation for Spectrogram - example of Spectrogram
Audio Signal with 150HZ wave and 300HZ Wave
import numpy as np
import matplotlib.pyplot as plt
fs = 44100#sampling frequency
CHUNK = 10000
signal_time = 20 # seconds
def sine(freq,fs,secs):
data=np.arange(fs*secs)/(fs*1.0)
wave = np.sin(freq*2*np.pi*data)
return wave
a1 = sine(150,fs,120)
a2 = sine(300,fs,120)
signal = a1+a2
afft = np.abs(np.fft.fft(signal[0:CHUNK]))
freqs = np.linspace(0,fs,CHUNK)[0:int(fs/2)]
spectrogram_chunk = freqs/np.amax(freqs*1.0)
# Plot spectral analysis
plt.plot(freqs[0:250],afft[0:250])
plt.show()
number_of_chunks = 1000
# Empty spectrogram
Spectrogram = np.zeros(shape = [CHUNK,number_of_chunks])
for i in range(number_of_chunks):
afft = np.abs(np.fft.fft(signal[i*CHUNK:(1+i)*CHUNK]))
freqs = np.linspace(0,fs,CHUNK)[0:int(fs/2)]
#plt.plot(spectrogram_chunk[0:250],afft[0:250])
#plt.show()
spectrogram_chunk = afft/np.amax(afft*1.0)
#print(signal[i*CHUNK:(1+i)*CHUNK].shape)
try:
Spectrogram[:,i]=spectrogram_chunk
except:
break
import cv2
Spectrogram = Spectrogram[0:250,:]
cv2.imshow('spectrogram',np.uint8(255*Spectrogram/np.amax(Spectrogram)))
cv2.waitKey()
cv2.destroyAllWindows()
Spectral analysis of single CHUNK
Spectrogram

How to validate the downsampling is as intended

how to validate whether the down sampled output is correct. For example, I had make some example, however, I am not sure whether the output is correct or not?
Any idea on the validation
Code
import numpy as np
import matplotlib.pyplot as plt # For ploting
from scipy import signal
import mne
fs = 100 # sample rate
rsample=50 # downsample frequency
fTwo=400 # frequency of the signal
x = np.arange(fs)
y = [ np.sin(2*np.pi*fTwo * (i/fs)) for i in x]
f_res = signal.resample(y, rsample)
xnew = np.linspace(0, 100, f_res.size, endpoint=False)
#
# ##############################
#
plt.figure(1)
plt.subplot(211)
plt.stem(x, y)
plt.subplot(212)
plt.stem(xnew, f_res, 'r')
plt.show()
Plotting the data is a good first take at a verification. Here I made regular plot with the points connected by lines. The lines are useful since they give a guide for where you expect the down-sampled data to lie, and also emphasize what the down-sampled data is missing. (It would also work to only show lines for the original data, but lines, as in a stem plot, are too confusing, imho.)
import numpy as np
import matplotlib.pyplot as plt # For ploting
from scipy import signal
fs = 100 # sample rate
rsample=43 # downsample frequency
fTwo=13 # frequency of the signal
x = np.arange(fs, dtype=float)
y = np.sin(2*np.pi*fTwo * (x/fs))
print y
f_res = signal.resample(y, rsample)
xnew = np.linspace(0, 100, f_res.size, endpoint=False)
#
# ##############################
#
plt.figure()
plt.plot(x, y, 'o')
plt.plot(xnew, f_res, 'or')
plt.show()
A few notes:
If you're trying to make a general algorithm, use non-rounded numbers, otherwise you could easily introduce bugs that don't show up when things are even multiples. Similarly, if you need to zoom in to verify, go to a few random places, not, for example, only the start.
Note that I changed fTwo to be significantly less than the number of samples. Somehow, you need at least more than one data point per oscillation if you want to make sense of it.
I also remove the loop for calculating y: in general, you should try to vectorize calculations when using numpy.
The spectrum of the resampled signal should have a tone at the same frequency as the input signal just in a smaller nyquist bandwidth.
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
import scipy.fftpack as fft
fs = 100 # sample rate
rsample=50 # downsample frequency
fTwo=10 # frequency of the signal
n = np.arange(1024)
y = np.sin(2*np.pi*fTwo/fs*n)
y_res = signal.resample(y, len(n)/2)
Y = fft.fftshift(fft.fft(y))
f = -fs*np.arange(-512, 512)/1024
Y_res = fft.fftshift(fft.fft(y_res, 1024))
f_res = -fs/2*np.arange(-512, 512)/1024
plt.figure(1)
plt.subplot(211)
plt.stem(f, abs(Y))
plt.subplot(212)
plt.stem(f_res, abs(Y_res))
plt.show()
The tone is still at 10.
IF you down sample a signal both signals will still have the exact same value and a given time , so just loop through "time" and check that the values are the same. In your case you go from a sample rate of 100 to 50. Assuming you have 1 seconds worth of data from building your x from fs, then just loop through t = 0 to t=1 in 1/50'th increments and make sure that Yd(t) = Ys(t) where Yd d is the down sampled f and Ys is the original sampled frequency. Or to say it simply Yd(n) = Ys(2n) for n = 1,2,3,...n=total_samples-1.

Categories

Resources