I have four cosines with frequencies 400e-3, 500e-3, 600e-3 and 700e-3, and I am trying to take their FFT, but with the record length I need I cannot distinguish the four peaks. Is there a way to separate the peaks without changing the tmax of 1.76 or the frequencies?
import numpy as np
import scipy.fftpack
from scipy.fftpack import fftfreq
from scipy.fft import fft
import matplotlib.pyplot as plt
t = np.linspace(0,1.76,2400)
f = [400e-3, 500e-3, 600e-3, 700e-3] # these are the frequencies
yy = 0
for i in f:
    y = 0.5*np.cos(2*np.pi*i*t)
    yy = yy + y
plt.figure(0)
plt.plot(t, yy)
f = fftfreq(len(t), np.diff(t)[0])
yf = fft(yy)
plt.figure(1)
plt.plot(f[:t.size//2], np.abs(yf[:t.size//2]))
plt.show()
Here are the results:
The solution was to increase tmax in
t = np.linspace(0,1.76,2400)
i.e. the 1.76. The FFT produces frequency bins of width 1/tmax, so the smaller tmax is, the wider the bins and the poorer the frequency resolution. Here 1/1.76 ≈ 0.57, which is much larger than the 0.1 spacing between the four frequencies, so their peaks merge.
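For illustration, here is a minimal sketch with an assumed longer record of tmax = 40 (any value well above 1/0.1 = 10 would do), so that the bin width 1/tmax drops below the 0.1 spacing and the four peaks separate:
import numpy as np
from scipy.fft import fft, fftfreq
import matplotlib.pyplot as plt
tmax = 40.0                                   # assumed longer record; bin width = 1/tmax = 0.025
t = np.linspace(0, tmax, 2400)
yy = sum(0.5*np.cos(2*np.pi*fi*t) for fi in [400e-3, 500e-3, 600e-3, 700e-3])
freqs = fftfreq(len(t), t[1] - t[0])
yf = fft(yy)
plt.plot(freqs[:t.size//2], np.abs(yf[:t.size//2]))
plt.show()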
I've been trying to get the Fourier transform of the data in the file S1L1E here: https://github.com/gergobes/data-for-fft/blob/c0b664379c0eeab04abdc541e34d6e636e841eb0/S1L1E. The first column is time, the 2nd column is the amplitude of the first wave, and the 3rd column is the amplitude of another wave. The code I tried is:
import matplotlib.pyplot as plt
import numpy as np
from scipy.fft import fft, fftfreq
data = np.loadtxt("S1L1E")
time = data[:,0]
amp_left = data[:,1]
amp_right = data[:,2]
plt.plot(time, amp_left)
plt.plot(time, amp_right)
plt.show()
# fft attempt
samplerate = 5000
duration = time[-1]
N = int(samplerate * duration)
x = time
y = amp_left
yf = fft(y)
xf = fftfreq(len(y))
plt.plot(xf, abs(yf))
plt.show()
I tried it for the first wave but got only a spike at 0. What am I doing wrong? It's my first time trying an FFT, so I'm a bit lost here. I would appreciate any help.
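In case it is useful, here is a minimal sketch of the two usual fixes (these are guesses, since the result depends on the actual data in the file): fftfreq needs the sample spacing as its second argument to return frequencies in Hz rather than cycles per sample, and subtracting the mean removes the zero-frequency component that otherwise dwarfs everything else:
import numpy as np
from scipy.fft import fft, fftfreq
import matplotlib.pyplot as plt
data = np.loadtxt("S1L1E")
time, amp_left = data[:, 0], data[:, 1]
dt = np.mean(np.diff(time))              # actual sample spacing taken from the file
y = amp_left - amp_left.mean()           # remove the DC offset so the 0 Hz spike does not dominate
yf = fft(y)
xf = fftfreq(len(y), dt)                 # frequency axis in Hz
plt.plot(xf[:len(y)//2], np.abs(yf[:len(y)//2]))
plt.show()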
This is not a duplicate question since other answers only explain how to plot the cross-correlation function and do not explain how you can get the time difference.
Given a sine signal and a shifted version of it, we should be able to get the time delay between them.
I have created a sine signal and shifted it by t_shift = 0.05. The following is my code and its output:
import numpy as np
import matplotlib.pyplot as plt
fs = 1000
x = np.linspace(0, 1, fs)
f = 5
t_shift = 0.05
y = np.sin(2*np.pi*f*x)
y_shifted = np.sin(2*np.pi*f*(x-t_shift))
fig, ax = plt.subplots()
ax.plot(x, y, x, y_shifted)
plt.show()
By normalizing signals and applying numpy.correlate we get the following:
y_norm = (y-y.mean())/y.std()
y_shifted_norm = (y_shifted - y_shifted.mean())/y_shifted.std()
cc = np.correlate(y_norm, y_shifted_norm, 'full')
fig, ax = plt.subplots()
ax.plot(range(len(cc)), cc)
plt.show()
Question
From the indices of cross-correlation function, how can I get t_shift=0.05?
@Sepide: It seems to me as if you are trying to maximise the correlation between the signal y and its shifted version y_shifted. This might be accomplished using np.correlate(), but recovering the time shift from its indices is not entirely trivial. In the solution below I manually shift the time series and compute the correlation coefficient using np.corrcoef. As soon as this Pearson correlation coefficient equals 1, the two signals are aligned.
import numpy as np
import matplotlib.pyplot as plt
# Setting
fs = 1000
x = np.linspace(0, 1, fs)
f = 5
t_shift = 0.05
t_step = 1/fs
# Data
y = np.sin(2*np.pi*f*x)
y_shifted = np.sin(2*np.pi*f*(x-t_shift))
# Compute correlation
MaxTimeShift = 200
CorrelationList = np.empty((MaxTimeShift, 1))
CorrelationList[:] = np.nan
# Compute correlation for various shifts
for iter in range(MaxTimeShift):
    CorrelationList[iter] = np.corrcoef(y[0:801].T, y_shifted[iter:(801+iter)].T)[0, 1]
# Plot 1
plt.figure(1)
plt.plot(x, y, x, y_shifted)
plt.show()
# Plot 2
plt.figure(2)
ShiftList = t_step*np.arange(MaxTimeShift)
plt.plot(ShiftList, CorrelationList)
plt.title("Correlation coefficient")
plt.show()
print("The time shift between the signals is: ", ShiftList[np.argmax(CorrelationList)])
I want to plot a power spectrum from my data set (an array of about 2000 values; the data is recorded every minute).
I've gotten so far as:
y= np.fft.fft(data)
abs = np.abs(y) #absolute value
p = np.square(abs) #power
but am confused about setting the frequency.
I've tried using freqs = np.fft.fftfreq(len(y)), but the plot I get from that can't be right.
What am I doing wrong?
Here is an example to plot the power spectrum:
import matplotlib.pyplot as plt
import numpy as np
t = np.linspace(0, 2, 2000)   # 2 s sampled at ~1 kHz, so the 60 Hz and 42 Hz tones are well below Nyquist
data = 2 * np.sin(2*np.pi*60*t) + 2 * np.sin(2*np.pi*42*t)
spectrum = np.fft.fft(data)
power_spectrum = np.square(np.abs(spectrum))
fig, ax = plt.subplots()
ax.plot(np.arange(len(power_spectrum)), power_spectrum)
plt.show()
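To turn the x-axis into real frequencies, np.fft.fftfreq takes the sample spacing as its second argument; for data recorded once per minute that spacing is 60 s (or d=1.0 if you prefer cycles per minute). A sketch with a made-up 2000-point series standing in for the actual data:
import numpy as np
import matplotlib.pyplot as plt
minutes = np.arange(2000)
data = np.sin(2*np.pi*minutes/120)          # hypothetical stand-in series with a 2-hour cycle
dt = 60.0                                   # sample spacing in seconds
freqs = np.fft.fftfreq(len(data), d=dt)     # frequency axis in Hz
power = np.square(np.abs(np.fft.fft(data)))
half = len(data)//2                         # keep only the non-negative frequencies
plt.plot(freqs[:half], power[:half])
plt.xlabel("Frequency [Hz]")
plt.ylabel("Power")
plt.show()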
I'm creating a sine wave of 100 Hz and trying to plot its STFT:
import scipy.io
import numpy as np
import librosa
import librosa.display
#%matplotlib notebook
import matplotlib.pyplot as plt
A = 1 # Amplitude
f0 = 100 # frequency
Fs = f0 * 800 # Sampling frequency
t = np.arange(Fs) / float(Fs)
X = np.sin(2*np.pi*t*f0)
plt.plot(t, X)
plt.xlabel("Time")
plt.ylabel("Amplitude")
plt.show()
D = np.abs(librosa.stft(X))
librosa.display.specshow(librosa.amplitude_to_db(D,ref=np.max),y_axis='log', x_axis='time')
I was expecting a single line at 100 Hz instead.
Also, how can I plot a frequency (x-axis) vs. amplitude (y-axis) graph to see a peak at 100 Hz?
You need to pass the sample rate to specshow, using the sr keyword argument. Otherwise it will default to 22kHz, which will give wrong results.
D = np.abs(librosa.stft(X))
db = librosa.amplitude_to_db(D,ref=np.max)
librosa.display.specshow(db, sr=Fs, y_axis='log', x_axis='time')
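For the second part of the question (a frequency vs. amplitude plot with a peak at 100 Hz), one option is a plain one-sided FFT of the signal rather than an STFT. A sketch reusing X and Fs from the code above:
yf = np.fft.rfft(X)                         # one-sided spectrum of the real signal
xf = np.fft.rfftfreq(len(X), d=1.0/Fs)      # frequency axis in Hz
amplitude = 2.0 * np.abs(yf) / len(X)       # scaled so a unit-amplitude sine shows up as ~1
plt.plot(xf, amplitude)
plt.xlim(0, 300)                            # zoom in around the 100 Hz peak
plt.xlabel("Frequency [Hz]")
plt.ylabel("Amplitude")
plt.show()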
I have a DataFrame that contains two columns named "thousands of dollars per year" and "EMPLOY".
I create a new variable in this DataFrame named "cubic_Root" by computing it from df['thousands of dollars per year']:
df['cubic_Root'] = -1 / df['thousands of dollars per year'] ** (1. / 3)
The data in df['cubic_Root'] looks like this:
ID cubic_Root
1 -0.629961
2 -0.405480
3 -0.329317
4 -0.480750
5 -0.305711
6 -0.449644
7 -0.449644
8 -0.480750
Now, how can I draw a normal probability plot using the data in df['cubic_Root']?
You want the "Probability" Plots.
So for a single plot, you'd have something like below.
import scipy.stats
import numpy as np
import matplotlib.pyplot as plt
# 100 values from a normal distribution with a std of 3 and a mean of 0.5
data = 3.0 * np.random.randn(100) + 0.5
counts, start, dx, _ = scipy.stats.cumfreq(data, numbins=20)
x = np.arange(counts.size) * dx + start
plt.plot(x, counts, 'ro')
plt.xlabel('Value')
plt.ylabel('Cumulative Frequency')
plt.show()
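If what you need is specifically a normal probability plot (sample quantiles against theoretical normal quantiles), scipy.stats.probplot can draw it directly; a sketch with the same synthetic data (substitute df['cubic_Root'].values for your column):
import scipy.stats
import numpy as np
import matplotlib.pyplot as plt
data = 3.0 * np.random.randn(100) + 0.5     # or: data = df['cubic_Root'].values
scipy.stats.probplot(data, dist="norm", plot=plt)
plt.show()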
If you want to plot a distribution, and you know it, define it as a function, and plot it as so:
import numpy as np
from matplotlib import pyplot as plt
def my_dist(x):
    return np.exp(-x ** 2)
x = np.arange(-100, 100)
p = my_dist(x)
plt.plot(x, p)
plt.show()
If you don't have the exact distribution as an analytical function, perhaps you can generate a large sample, take a histogram and somehow smooth the data:
import numpy as np
from scipy.interpolate import UnivariateSpline
from matplotlib import pyplot as plt
N = 1000
n = N // 10   # number of bins (np.histogram needs an integer here)
s = np.random.normal(size=N) # generate your data sample with N elements
p, x = np.histogram(s, bins=n) # bin it into n = N/10 bins
x = x[:-1] + (x[1] - x[0])/2 # convert bin edges to centers
f = UnivariateSpline(x, p, s=n)
plt.plot(x, f(x))
plt.show()
You can increase or decrease s (the smoothing factor) in the UnivariateSpline call to get more or less smoothing; for example, a small and a large value of s give noticeably different curves, as in the sketch below.
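A quick sketch comparing two smoothing factors, reusing x, p and n from the code above (the particular values n/10 and n*10 are just illustrative choices):
f_tight = UnivariateSpline(x, p, s=n/10)    # less smoothing, follows the histogram closely
f_loose = UnivariateSpline(x, p, s=n*10)    # more smoothing, flatter curve
plt.plot(x, f_tight(x), label="s = n/10")
plt.plot(x, f_loose(x), label="s = n*10")
plt.legend()
plt.show()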
Probability density function (PDF) of inter-arrival time of events.
import numpy as np
import scipy.stats
# generate data samples
data = scipy.stats.expon.rvs(loc=0, scale=1, size=1000, random_state=123)
A kernel density estimation can then be obtained by simply calling
scipy.stats.gaussian_kde(data,bw_method=bw)
where bw is an (optional) parameter for the estimation procedure. For this data set, and considering three values for bw, the fit is computed and plotted as shown below.
# test values for the bw_method option ('None' is the default value)
bw_values = [None, 0.1, 0.01]
# generate a list of kde estimators for each bw
kde = [scipy.stats.gaussian_kde(data,bw_method=bw) for bw in bw_values]
# plot (normalized) histogram of the data
import matplotlib.pyplot as plt
plt.hist(data, 50, density=True, facecolor='green', alpha=0.5)
# plot density estimates
t_range = np.linspace(-2,8,200)
for i, bw in enumerate(bw_values):
    plt.plot(t_range, kde[i](t_range), lw=2, label='bw = ' + str(bw))
plt.xlim(-1, 6)
plt.legend(loc='best')
plt.show()
Reference:
Python: Matplotlib - probability plot for several data set
how to plot Probability density Function (PDF) of inter-arrival time of events?