The peak suggestion, in the comments, does not find the peak that occurs most often. I need to find the frequency that occurs most often.
I need to find the dominant frequency in my Coefficient of Lift data. The frequency I am getting with the following code is quite large and not the dominant frequency. I know because the 2-D analysis is easy to analyze with a graph. It is sinusoidal. By dominant frequency, I mean the frequency of the signal with the most repeats.
#1/usr/bin/env python
import sys
import numpy
from numpy import sin
from math import pi
print("Hello World!")
data = numpy.loadtxt('CoefficientLiftData.dat', usecols= (0,3))
n = data.size
timestep = 0.000005
The peak data suggestion, in the comments, does not provide a count method to find how often a frequency occurs.
print(data)
fourier = numpy.fft.fft(data)
frequencies = numpy.fft.fftfreq(n, d=timestep)
positive_frequencies = frequencies[numpy.where(frequencies >= 0)]
magnitudes = abs(fourier[numpy.where(frequencies >= 0)])
peak_frequency = numpy.argmax(magnitudes)
print(peak_frequency)
Here is how to extract the dominate frequency from Coefficient of Lift data from, as an example, OpenFOAM.
#1/usr/bin/env python
import sys
import numpy as np
import scipy.fftpack as fftpack
from numpy import sin
from math import pi
import matplotlib.pyplot as plt
print("Hello World!")
N = 2500
Nev = 1000
data = np.loadtxt('CoefficientLiftData.dat', usecols= (0,3))
times = data[:,0]
length = int(len(times)/2)
forcez= data[:,1]
t = np.linspace(times[length], times[-1], 2500)
forcezint = np.interp(t, times, forcez)
fourier = fftpack.fft(forcezint[Nev-1:N-1])
frequencies = fftpack.fftfreq(forcezint[Nev-1:N-1].size, d=t[1]-t[0])
#print(frequencies)
freq = frequencies[np.argmax(np.abs(fourier))]
print(freq)
Related
I have a plot of 3 Dirac Delta functions after computing an fft using scipy. I want to find the 3 frequencies at which the delta dirac functions occur, therefore, where the y component(amplitude) is greater than 0, but how do I then find their corresponding x component (frequency). Is there a simpler way to print the dominating output frequencies of an fft?
I tried using the np.interp function but it accepts x values and returns y values. I tried inputting the reverse but it only returned the maximum frequency. I don't have an equation simply relating x and y as I've used an fft on my x and y values.
I found all y values above a certain level, y>100 in this case.
But how do I find their corresponding x values?
Fourier Plot
import matplotlib.pyplot as plt
import numpy as np
import math
import pandas as pd
from scipy.fft import fft, fftfreq
import scipy
sample_rate = 44100
duration = 5
N = sample_rate*duration
time = x = np.linspace(0, duration, N, endpoint=False)
amplitude = np.sin(7000*time*2*np.pi)*10 + np.cos(time*5000*(2*np.pi)) + np.sin(time*200*2*np.pi)*5
plt.plot(amplitude[:1000])
from scipy.fft import rfft, rfftfreq
yf = scipy.fft.rfft(amplitude)
xf = scipy.fft.rfftfreq(N, 1/sample_rate)
plt.plot(xf, np.abs(yf))
plt.xlim(0, 10000)
def check_amp(number):
if np.abs(number) > 100:
return True
return False
fft_outputs_iterator = filter(check_amp, yf)
fft_outputs = list(fft_outputs_iterator)
print(np.abs(fft_outputs))
I'm trying to determine the different time-periods present in a waveform (shown below) using findpeaks().
The dataset for which I'm finding the peaks and eventually time-period, is an Autocorrelation Plot (though it could have been any given plot). I generated it using the following code (which you can use as it is for your reference):
import json
import sys, os
import numpy as np
import pandas as pd
import glob
import pickle
from statsmodels.tsa.stattools import adfuller, acf, pacf
from scipy.signal import find_peaks, square
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
import matplotlib.pyplot as plt
#GENERATION OF A FUNCTION WITH DUAL SEASONALITY & NOISE
def white_noise(mu, sigma, num_pts):
""" Function to generate Gaussian Normal Noise
Args:
sigma: std value
num_pts: no of points
mu: mean value
Returns:
generated Gaussian Normal Noise
"""
noise = np.random.normal(mu, sigma, num_pts)
return noise
def signal_line_plot(input_signal: pd.Series, title: str = "", y_label: str = "Signal"):
""" Function to plot a time series signal
Args:
input_signal: time series signal that you want to plot
title: title on plot
y_label: label of the signal being plotted
Returns:
signal plot
"""
plt.plot(input_signal)
plt.title(title)
plt.ylabel(y_label)
plt.show()
t_week = np.linspace(1,480, 480)
t_weekend=np.linspace(1,192,192)
T=96 #Time Period
x_weekday = 10*square(2*np.pi*t_week/T, duty=0.7)+10 + white_noise(0, 1,480)
x_weekend = 2*square(2*np.pi*t_weekend/T, duty=0.7)+2 + white_noise(0,1,192)
x_daily_weekly = np.concatenate((x_weekday, x_weekend))
x_daily_weekly_long = np.concatenate((x_daily_weekly,x_daily_weekly,x_daily_weekly,x_daily_weekly,x_daily_weekly,x_daily_weekly,x_daily_weekly,x_daily_weekly,x_daily_weekly,x_daily_weekly))
signal_line_plot(x_daily_weekly_long)
signal_line_plot(x_daily_weekly_long[0:1000])
#x_daily_weekly_long is the final waveform on which I'm carrying out Autocorrelation
#PERFORMING AUTOCORRELATION:
import scipy.signal as signal
autocorr = signal.correlate(x_daily_weekly_long, x_daily_weekly_long, mode = "same")
lags = signal.correlation_lags(len(x_daily_weekly_long), len(x_daily_weekly_long), mode = "same")
#VISUALIZATION:
f = plt.figure()
f.set_figwidth(40)
f.set_figheight(10)
plt.plot(lags, autocorr)
My approach to determining the time-period is as follows:
#1) Finding peak indices
indices = find_peaks(autocorr.flatten())[0]
#2) Determining Period
diff = [(indices[i - 1] - x) for i, x in enumerate(indices)][1:]
short_period = abs(np.mean(diff))/fs
short_period #is the distance between all the individual peaks.
In this way, I'm able to only find the time-period (distance) between individual peaks. But my need is to determine the time-period for all the multiple periods that the data may contain.
Can anyone please help, how to determine all possible time-periods present in the data?
Please feel free to point out any errors/improvements in the existing code
So this is a very basic question and I only have a beginner level understanding of signal processing. I have a 1.02 second accelerometer data sampled at 32000 Hz. I am looking to extract the following frequency domain features after having performed FFT in python -
Mean Freq, Median Freq, Power Spectrum Deformation, Spectrum energy, Spectral Kurtosis, Spectral Skewness, Spectral Entropy, RMSF (Root Mean Square Freq.), RVF (Root Variance Frequency), Power Cepstrum.
More specifically, I am looking for plots of these features as a final output.
The csv file containing data has four columns: Time, X Axis Value, Y Axis Value, Z Axis Value (The accelerometer is a triaxial one). So far on python, I have been able to visualize the time domain data, apply convolution filter to it, applied FFT and generated a Spectogram that shows an interesting shock
To Visualize Data
#Importing pandas and plotting modules
import numpy as np
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
#Reading Data
data = pd.read_csv('HelicalStage_Aug1.csv', index_col=0)
data = data[['X Value', 'Y Value', 'Z Value']]
date_rng = pd.date_range(start='1/8/2018', end='11/20/2018', freq='s')
#Plot the entire time series data and show gridlines
data.grid=True
data.plot()
enter image description here
Denoising
# Applying Convolution Filter
mylist = [1, 2, 3, 4, 5, 6, 7]
N = 3
cumsum, moving_aves = [0], []
for i, x in enumerate(mylist, 1):
cumsum.append(cumsum[i-1] + x)
if i>=N:
moving_ave = (cumsum[i] - cumsum[i-N])/N
#can do stuff with moving_ave here
moving_aves.append(moving_ave)
np.convolve(x, np.ones((N,))/N, mode='valid')
result_X = np.convolve(data[["X Value"]].values[:,0], np.ones((20001,))/20001, mode='valid')
result_Y = np.convolve(data[["Y Value"]].values[:,0], np.ones((20001,))/20001, mode='valid')
result_Z = np.convolve(data[["Z Value"]].values[:,0],
np.ones((20001,))/20001, mode='valid')
plt.plot(result_X-np.mean(result_X))
plt.plot(result_Y-np.mean(result_Y))
plt.plot(result_Z-np.mean(result_Z))
enter image description here
FFT and Spectogram
import numpy as np
import scipy as sp
import scipy.fftpack
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
df = pd.read_csv('HelicalStage_Aug1.csv')
df = df.drop(columns="Time")
df.plot()
plt.title('Sensor Data as Time Series')
signal = df[['Y Value']]
signal = np.squeeze(signal)
Y = np.fft.fftshift(np.abs(np.fft.fft(signal)))
Y = Y[int(len(Y)/2):]
Y = Y[10:]
plt.figure()
plt.plot(Y)
enter image description here
plt.figure()
powerSpectrum, freqenciesFound, time, imageAxis = plt.specgram(signal, Fs= 32000)
plt.show()
enter image description here
If my code is correct and the generated FFT and spectrogram are good, then how can I graphically compute the previously mentioned frequency domain features?
I have tried doing the following for MFCC -
import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
from scipy.io import wavfile
from python_speech_features import mfcc
from python_speech_features import logfbank
# Extract MFCC and Filter bank features
mfcc_features = mfcc(signal, Fs)
filterbank_features = logfbank(signal, Fs)
# Printing parameters to see how many windows were generated
print('\nMFCC:\nNumber of windows =', mfcc_features.shape[0])
print('Length of each feature =', mfcc_features.shape[1])
print('\nFilter bank:\nNumber of windows =', filterbank_features.shape[0])
print('Length of each feature =', filterbank_features.shape[1])
Visualizing filter bank features
#Matrix needs to be transformed in order to have horizontal time domain
mfcc_features = mfcc_features.T
plt.matshow(mfcc_features)
plt.title('MFCC')
enter image description here
enter image description here
I think your fft taking procedure is not correct, fft output is usually peak and when you are taking abs it should be one peak, as , probably you should change it to Y = np.fft.fftshift(np.abs(np.fft.fft(signal))) to Y=np.abs(np.fft.fftshift(signal)
I'm starting DSP on Python and I'm having some difficulties:
I'm trying to define a sine wave with frequency 1000Hz
I try to do the FFT and find its frequency with the following piece of code:
import numpy as np
import matplotlib.pyplot as plt
sampling_rate = int(10e3)
n = int(10e3)
sine_wave = [100*np.sin(2 * np.pi * 1000 * x/sampling_rate) for x in range(0, n)]
s = np.array(sine_wave)
print(s)
plt.plot(s[:200])
plt.show()
s_fft = np.fft.fft(s)
frequencies = np.abs(s_fft)
plt.plot(frequencies)
plt.show()
So first plot makes sense to me.
Second plot (FFT) shows two frequencies:
i) 1000Hz, which is the one I set at the beggining
ii) 9000Hz, unexpectedly
freqeuncy domain
Your data do not respect Shannon criterion. you do not set a correct frequencies axis.
It's easier also to use rfft rather than fft when the signal is real.
Your code can be adapted like :
import numpy as np
import matplotlib.pyplot as plt
sampling_rate = 10000
n = 10000
signal_freq = 4000 # must be < sampling_rate/2
amplitude = 100
t=np.arange(0,n/sampling_rate,1/sampling_rate)
sine_wave = amplitude*np.sin(2 * np.pi *signal_freq*t)
plt.subplot(211)
plt.plot(t[:30],sine_wave[:30],'ro')
spectrum = 2/n*np.abs(np.fft.rfft(sine_wave))
frequencies = np.fft.rfftfreq(n,1/sampling_rate)
plt.subplot(212)
plt.plot(frequencies,spectrum)
plt.show()
Output :
There is no information loss, even if a human eye can be troubled by the temporal representation.
I have used this code how to extract frequency associated with fft values in python and added a firwin filter to detect a short high frequency signal. The signal is 1 second long and occurs at random in a wav audio file of x-seconds. My code looks as follows:
from scipy import signal
from scipy.io import wavfile
from scipy.fftpack import fft, ifft,fftfreq
import matplotlib.pyplot as plt
import wave
import numpy as np
import sys
import struct
frate,data = wavfile.read('SoundPeep.wav')
#print(frate)
b = signal.firwin(101, cutoff=900, fs= frate, pass_zero=False)
data = signal.lfilter(b, [1.0], data)
w = np.fft.fft(data)
freqs = np.fft.fftfreq(len(w))
#print(len(w)/frate)
#print(w.min(), w.max())
# (-0.5, 0.499975)
# Find the peak in the coefficients
idx = np.argmax(np.abs(w))
winner = np.argwhere(np.abs(w) == np.amax(np.abs(w)))
freq = freqs[idx]
freq_in_hertz = abs(freq * frate)
print("HZ")
print(freq_in_hertz)
occurence = idx/frate
print(occurence)
The code works well at detecting the peak HZ frequency. My problem is that I want to calculate where the high frequency signal begins in the audio-file. I thought it could be done simply by dividing the index(idx) by the framerate of the recording but this does not seem to work.
you could make a sonogram by using a short time fourier transform to see how frequency changes over time. the stft returns 3 values: timestamp, amplitude, and frequency. If the amplitude isnt relevant, you can just use peak detection from there.