Vector and RMS averaging in FFT - python

I have a data array on which I have performed an FFT. This is the code that I have applied.
import numpy as np
# "data" is a column vector on which FFT needs to be performed
# N = No. of points in "data"
# dt = time interval between two corresponding data points
FFT_data = np.fft.fft(data) # Complex values
FFT_data_real = 2/N*abs(FFT_data) # Absolute values
However, I went through following link: https://www.dsprelated.com/showarticle/1159.php
Here it says, to enhance the SNR we can apply "RMS-averaged FFT" and "Vector Averaged FFT".
Can somebody please let me know how to we go about doing these two methodologies in Python or is there any documentation/links to which we can refer ?

As your reference indicates:
If you take the square root of the average of the squares of your sample spectra, you are doing RMS Averaging. Another alternative is Vector Averaging in which you average the real and complex components separately.
Obviously to be able to perform either averaging you'd need to have more than a single data set to average. In your example code, you have a single column vector data. Let's assume you have multiple such column vectors arranged as a 2D NxM matrix, where N is the number of points per dataset and M is the number of datasets. Since the datasets are stored in columns, when computing the FFT you will need to specify the parameter axis=0 to compute the FFT along columns.
RMS-averaged FFT
As the name suggests, for this method you need to take the square-root of the mean of the squared amplitudes. Since the different sets are stored in columns, you'd need to do the average along the axis 1 (the other axis than the one used for the FFT).
FFT_data = np.fft.fft(data, axis=0) # Complex values
FFT_data_real = 2/N*abs(FFT_data) # Absolute values
rms_averaged = np.sqrt(np.mean(FFT_data_real**2, axis=1))
Vector Averaged FFT
In this case you need to obtain the real and imaginary components of the FFT data, then compute the average on each separately:
FFT_data = np.fft.fft(data, axis=0) # Complex values
real_part_avg = 2/N*np.mean(np.real(FFT_data),axis=1)
imag_part_avg = 2/N*np.mean(np.imag(FFT_data),axis=1)
vector_averaged = np.abs(real_part_avg+1j*imag_part_avg)
Note that I've kept the 2/N scaling you had for the absolute values.
But what can I do if I really only have one dataset?
If that dataset happens to be stationary and sufficiently large then you could break down your dataset into smaller blocks. This can be done by reshaping your vector into an NxM matrix with the following:
data = data.reshape(N,M)
...
Then you could perform the averaging with either method.

Related

Getting variance values for random samples generated from a standard normal distribution using numpy

I have a function that gives me probability distributions for each class, in terms of a matrix corresponding to mean values and another matrix corresponding to variance values. For example, if I had four classes then I would have the following outputs:
y_means = [1,2,3,4]
y_variance = [0.01,0.02,0.03,0.04]
I need to do the following calculation to the mean values to continue with the rest of my program:
y_means = np.array(y_means)
y_means = np.reshape(y_means,(y_means.size,1))
A = np.random.randn(10,y_means.size)
y_means = np.matmul(A,y_means)
Here, I have used the numpy.random.randn function to generate random samples from a standard normal distribution, and then multiply this with the matrix with the mean value to obtain a new output matrix. The dimension of the output matrix would then be of the size (10 x 1).
I need to do a similar calculation such that my output_variances will also be a (10 x 1) matrix. But it is not meaningful to multiply the variances in the same way with random samples from a standard normal distribution, because this would result in negative values as well. This is undesirable because my ultimate aim would be to create a normal distribution with these mean values and their corresponding variance values using:
torch.distributions.normal.Normal(loc=y_means, scale=y_variance)
So my question is if there is any method by which I get a variance value for each random sample generated by numpy.random.randn? Because then the multplication of such a matrix would make more sense with output_variance.
Or if there is any other strategy for this that I might be unaware of, please let me know.
The problem mentioned in the question required another matrix of the same dimension as A that corresponded to a variance measure for the random samples present in A.
Taking a row-wise or column-wise variance of the matrix denoted by A using numpy.var() didn't give a similar 10 x 4 matrix to multiply with y_variance.
I had solved the above problem by using the following approach:
First create a matrix with the same dimensions as A with zero entries, using the following line of code:
A_var = np.zeros_like(A)
then, using torch.distributions, create normal distributions with the values in A as the mean and zeroes as variance:
dist_A = torch.distributions.normal.Normal(loc=torch.Tensor(A), scale=torch.Tensor(A_var))
https://pytorch.org/docs/stable/distributions.html lists all the operations possible on Normal distributions in PyTorch. The sample() method can generate samples from a given distribution for any size. This property was exploited to first generate a sample matrix of size 10 X 10 x 4 and then calculating the variance along axis 0.
np.var(np.array(dist2.sample((10,))),axis=0)
This would result in a variance matrix of size 10 x 4, which can be used for calculations with y_variance.

How to interpret this fft graph

I want to apply Fourier transformation using fft function to my time series data to find "patterns" by extracting the dominant frequency components in the observed data, ie. the lowest 5 dominant frequencies to predict the y value (bacteria count) at the end of each time series.
I would like to preserve the smallest 5 coefficients as features, and eliminate the rest.
My code is as below:
df = pd.read_csv('/content/drive/My Drive/df.csv', sep=',')
X = df.iloc[0:2,0:10000]
dft_X = np.fft.fft(X)
print(dft_X)
print(len(dft_X))
plt.plot(dft_X)
plt.grid(True)
plt.show()
# What is the graph about(freq/amplitude)? How much data did it use?
for i in dft_X:
m = i[np.argpartition(i,5)[:5]]
n = i[np.argpartition(i,range(5))[:5]]
print(m,'\n',n)
Here is the output:
But I am not sure how to interpret this graph. To be precise,
1) Does the graph show the transformed values of the input data? I only used 2 rows of data(each row is a time series), thus data is 2x10000, why are there so many lines in the graph?
2) To obtain frequency value, should I use np.fft.fftfreq(n, d=timestep)?
Parameters:
n : int
Window length.
d : scalar, optional
Sample spacing (inverse of the sampling rate). Defaults to 1.
Returns:
f : ndarray
Array of length n containing the sample frequencies.
How to determine n(window length) and sample spacing?
3) Why are transformed values all complex numbers?
Thanks
I'm gonna answer in reverse order of your questions
3) Why are transformed values all complex numbers?
The output of a Fourier Transform is always complex numbers. To get around this fact, you can either apply the absolute value on the output of the transform, or only plot the real part using:
plt.plot(dft_X.real)
2) To obtain frequency value, should I use np.fft.fftfreq(n, d=timestep)?
No, the "frequency values" will be visible on the output of the FFT.
1) Does the graph show the transformed values of the input data? I only used 2 rows of data(each row is a time series), thus data is 2x10000, why are there so many lines in the graph?
Your graph has so many lines because it's making a line for each column of your data set. Apply the FFT on each row separately (or possibly just transpose your dataframe) and then you'll get more actual frequency domain plots.
Follow up
Would using absolute value or real part of the output as features for a later model have different effect than using the original output?
Absolute values are easier to work with usually.
Using real part
Using absolute value
Here's the Octave code that generated this:
Fs = 4000; % Sampling rate of signal
T = 1/Fs; % Period
L = 4000; % Length of signal
t = (0:L-1)*T; % Time axis
freq = 1000; % Frequency of our sinousoid
sig = sin(freq*2*pi*t); % Fill Time-Domain with 1000 Hz sinusoid
f_sig = fft(sig); % Apply FFT
f = Fs*(0:(L/2))/L; % Frequency axis
figure
plot(f,abs(f_sig/L)(1:end/2+1)); % peak at 1kHz)
figure
plot(f,real(f_sig/L)(1:end/2+1)); % main peak at 1kHz)
In my example, you can see the absolute value returned no noise at frequencies other than the sinusoid of frequency 1kHz I generated while the real part had a bigger peak at 1kHz but also had much more noise.
As for effects, I don't know what you mean by that.
is it expected that "frequency values" always be complex numbers
Always? No. The Fourier series represents the frequency coefficients at which the sum of sines and cosines completely equate any continuous periodic function. Sines and cosines can be written in complex forms through Euler's formula. This is the most convenient way to store Fourier coefficients. In truth, the imaginary part of your frequency-domain signal represents the phase of the signal. (i.e if I have 2 sine functions of the same frequency, they can have different complex forms depending on the time shifting). However, most libraries that provide an FFT function will, by default, store FFT coefficients as complex numbers, to facilitate phase and magnitude calculations.
Is it convention that FFT use each column of dataset when plotting a line
I think it is an issue with mathplotlib.plot, not np.fft.
Could you please show me how to apply FFT on each row separately
There are many ways to go around this and I don't want to force you down one path, so I will propose the general solution to iterate over each row of your dataframe and apply the FFT on each specific row. Otherwise, in your case, I believe transposing your output could also work.

How to combine the phase of one image and magnitude of different image into 1 image by using python

I want to combine phase spectrum of one image and magnitude spectrum of different image into one image.
I have got phase spectrum and magnitude spectrum of image A and image B.
Here is the code.
f = np.fft.fft2(grayA)
fshift1 = np.fft.fftshift(f)
phase_spectrumA = np.angle(fshift1)
magnitude_spectrumB = 20*np.log(np.abs(fshift1))
f2 = np.fft.fft2(grayB)
fshift2 = np.fft.fftshift(f2)
phase_spectrumB = np.angle(fshift2)
magnitude_spectrumB = 20*np.log(np.abs(fshift2))
I trying to figure out , but still i do not know how to do that.
Below is my test code.
imgCombined = abs(f) * math.exp(1j*np.angle(f2))
I wish i can come out just like that
Here are the few things that you would need to fix for your code to work as intended:
The math.exp function supports scalar exponentiation. For an element-wise matrix exponentiation you should use numpy.exp instead.
Similary, the * operator would attempt to perform matrix multiplication. In your case you want to instead perform element-wise multiplication which can be done with np.multiply
With these fixes you should get the frequency-domain combined matrix as follows:
combined = np.multiply(np.abs(f), np.exp(1j*np.angle(f2)))
To obtain the corresponding spatial-domain image, you would then need compute the inverse transform (and take the real part since there my be residual small imaginary parts due to numerical errors) with:
imgCombined = np.real(np.fft.ifft2(combined))
Finally the result can be shown with:
import matplotlib.pyplot as plt
plt.imshow(imgCombined, cmap='gray')
Note that imgCombined may contain values outside the [0,1] range. You would then need to decide how you want to rescale the values to fit the expected [0,1] range.
The default scaling (resulting in the image shown above) is to linearly scale the values such that the minimum value is set to 0, and the maximum value is set to 0.
Another way could be to limit the values to that range (i.e. forcing all negative values to 0 and all values greater than 1 to 1).
Finally another approach, which seems to provide a result closer to the screenshot provided, would be to take the absolute value with imgCombined = np.abs(imgCombined)

How do I scale an FFT-based cross-correlation such that its peak is equal to Pearson's rho

Description of the problem
FFT can be used to compute cross-correlation between two signals or images. To determine the delay or lag between two signals A and B, it suffices to locate the peak of:
IFFT(FFT(A)*conjugate(FFT(B)))
However, the amplitude of the peak is related to the amplitude of the frequency spectra of the individual signals. Thus to determine the Pearson correlation (rho), the amplitude of this peak must be scaled by the total energy in the two signals.
One way to do this is to normalize by the geometric mean of the individual autocorrelations. This gives a reasonable approximation of rho, especially when the delay between samples is small, but not the exact value.
I thought the reason for this error was that the Pearson correlation is only defined for the overlapping portions of the signal, whereas the normalization factor (the geometric mean of the two autocorrelation peaks) includes contributions from the non-overlapping portions. I considered two approaches for fixing this and producing an exact value for rho via FFT. In the first (called rho_exact_1 below), I trimmed the samples down to their overlapping portions and computed the normalization factor from those. In the second (called rho_exact_2 below), I computed the fraction of the measurements contained in the overlapping portion of the signals and multiplied the full-autocorrelation-normalization factor by that fraction.
Neither works! The figure below shows plots of the three approaches for calculating Pearson's rho using DFT-based cross-correlation. Only the region of the cross-correlation peak is shown. Each estimate is close to the correct value of 1.0, but not equal to it.
The code I used to perform the calculations is below. I used a simple sine wave as an example signal. I noticed that if I use a square-wave (w/ duty cycle not necessarily 50%) the approaches' errors change.
Can somebody explain what's going on?
A working example
import numpy as np
from matplotlib import pyplot as plt
# make a time vector w/ 256 points
# and a source signal
N_cycles = 10.0
N_points = 256.0
t = np.arange(0,N_cycles*np.pi,np.pi*N_cycles/N_points)
signal = np.sin(t)
use_rect = False
if use_rect:
threshold = -0.75
signal[np.where(signal>=threshold)]=1.0
signal[np.where(signal<threshold)]=-1.0
# normalize the signal (not technically
# necessary for this example, but required
# for measuring correlation of physically
# different signals)
signal = signal/signal.std()
# generate two samples of the signal
# with a temporal offset:
N = 128
offset = 5
sample_1 = signal[:N]
sample_2 = signal[offset:N+offset]
# determine the offset through cross-
# correlation
xc_num = np.abs(np.fft.ifft(np.fft.fft(sample_1)*np.fft.fft(sample_2).conjugate()))
offset_estimate = np.argmax(xc_num)
if offset_estimate>N//2:
offset_estimate = offset_estimate - N
# for an approximate estimate of Pearson's
# correlation, we normalize by the RMS
# of individual autocorrelations:
autocorrelation_1 = np.abs(np.fft.ifft(np.fft.fft(sample_1)*np.fft.fft(sample_1).conjugate()))
autocorrelation_2 = np.abs(np.fft.ifft(np.fft.fft(sample_2)*np.fft.fft(sample_2).conjugate()))
xc_denom_approx = np.sqrt(np.max(autocorrelation_1))*np.sqrt(np.max(autocorrelation_2))
rho_approx = xc_num/xc_denom_approx
print 'rho_approx',np.max(rho_approx)
# this is an approximation because we've
# included autocorrelation of the whole samples
# instead of just the overlapping portion;
# using cropped versions of the samples should
# yield the correct correlation:
sample_1_cropped = sample_1[offset:]
sample_2_cropped = sample_2[:-offset]
# these should be identical vectors:
assert np.all(sample_1_cropped==sample_2_cropped)
# compute autocorrelations of cropped samples
# and corresponding value for rho
autocorrelation_1_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_1_cropped)*np.fft.fft(sample_1_cropped).conjugate()))
autocorrelation_2_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_2_cropped)*np.fft.fft(sample_2_cropped).conjugate()))
xc_denom_exact_1 = np.sqrt(np.max(autocorrelation_1_cropped))*np.sqrt(np.max(autocorrelation_2_cropped))
rho_exact_1 = xc_num/xc_denom_exact_1
print 'rho_exact_1',np.max(rho_exact_1)
# alternatively we could try to use the
# whole sample autocorrelations and just
# scale by the number of pixels used to
# compute the numerator:
scaling_factor = float(len(sample_1_cropped))/float(len(sample_1))
rho_exact_2 = xc_num/(xc_denom_approx*scaling_factor)
print 'rho_exact_2',np.max(rho_exact_2)
# finally a sanity check: is rho actually 1.0
# for the two signals:
rho_corrcoef = np.corrcoef(sample_1_cropped,sample_2_cropped)[0,1]
print 'rho_corrcoef',rho_corrcoef
x = np.arange(len(rho_approx))
plt.plot(x,rho_approx,label='FFT rho_approx')
plt.plot(x,rho_exact_1,label='FFT rho_exact_1')
plt.plot(x,rho_exact_2,label='FFT rho_exact_2')
plt.plot(x,np.ones(len(x))*rho_corrcoef,'k--',label='Pearson rho')
plt.legend()
plt.ylim((.75,1.25))
plt.xlim((0,20))
plt.show()
The normalised cross correlation between two N-periodic discrete signals F and G is defined as:
Since the numerator is a dot product between two vectors (F and G_x) and the denominator is the product of the norm of these two vectors, the scalar r_x must indeed lie between -1 and +1 and it is the cosinus of the angle between the vectors (See there). If the vector F and G_x are aligned, then r_x=1. If r_x=1, then the vector F and G_x are aligned due to the triangular inequality. To ensure these properties, the vectors at the numerator must match those at the denominator.
All numerators can be computed at once by using the Discrete Fourier Transform. Indeed, that transform turns the convolution into pointwise products in the Fourier space. Here is why the different estimated normalized cross correlations are not 1 in the tests you performed.
For the first test "approx", sample_1 and sample_2 are both extracted from a periodic signal. Both are of the same length, but the length is not a multiple of the period as it is 2.5 periods (5pi) (figure below). As a result, since the dft performs the correlation as if they where periodic signals, it is found that sample_1 and sample_2 are not perfectly correlated and r_x<1.
For the second test rho_exact_1, the convolution is performed on signals of length N=128, but the norms at the denominator are computed on truncated vectors of size N-offset=128-5. As a result, the properties of r_x are lost. In addition, it must be noticed that the proposed convolution and norms are not normalized: the computed norms and convolution product are globally proportionnal to the number of points of the considered vectors. As a result, the norms of the truncated vectors are slightly lower compared to the previous case and r_x increases: values larger that 1 are likely encountered as the offset increases.
For the third test rho_exact_2, a scaling factor is introduced to try to correct the first test: the properties of r_x are also lost and values larger than one can be encountered as the scaling factor is larger than one.
Nevertheless, the function corrcoef() of numpy actually computes a r_x equal to 1 for the truncated signals. Indeed, these signals are perfectly identical! The same result can be obtained using DFTs:
xc_num_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_1_cropped)*np.fft.fft(sample_2_cropped).conjugate()))
autocorrelation_1_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_1_cropped)*np.fft.fft(sample_1_cropped).conjugate()))
autocorrelation_2_cropped = np.abs(np.fft.ifft(np.fft.fft(sample_2_cropped)*np.fft.fft(sample_2_cropped).conjugate()))
xc_denom_exact_11 = np.sqrt(np.max(autocorrelation_1_cropped))*np.sqrt(np.max(autocorrelation_2_cropped))
rho_exact_11 = xc_num_cropped/xc_denom_exact_11
print 'rho_exact_11',np.max(rho_exact_11)
To provide the user with a significant value for r_x, you can stick to the value provided by the first test, which can be lower than one for identical periodic signals if the length of the frame is not a multiple of the period. To correct this drawback, the estimated offset can also be retreived and used to build two cropped signals of the same length. The whole correlation procedure must be re-run to get a new value for r_x, which will not be plaged by the fact that the length of the cropped frame is not a multiple of the period.
Lastly, if the DFT is a very efficient way to compute the convolution at the numerator for all values of x at once, the denominator can be efficiently computed as 2-norms of vector, using numpy.linalg.norm. Since the argmax(r_x) for the cropped signals will likely be zero if the first correlation was successful, it could be sufficient to compute r_0 using a dot product `sample_1_cropped.dot(sample_2_cropped).

How to change parameters of a scikit learn function dynamically i.e. find best parameter

I am trying to do dimensionality reduction using PCA function of sklearn, specifically
from sklearn.decomposition import PCA
def mypca(X,comp):
pca = PCA(n_components=comp)
pca.fit(X)
PCA(copy=True, n_components=comp, whiten=False)
Xpca = pca.fit_transform(X)
return Xpca
for n_comp in range(10,1000,20):
Xpca = mypca(X,n_comp) # X is a 2 dimensional array
print Xpca
I am calling mypca function from a loop with different values for comp. I am doing this in order to find the best value of comp for the problem I am trying to solve. But mypca function always returns the same value i.e. Xpca irrespective of value of comp.
The value it returns is correct for first value of comp I send from the loop i.e. Xpca value which it sends each time is correct for comp = 10 in my case.
What should I do in order to find best value of comp?
You use PCA to reduce the dimension.
From your code:
for n_comp in range(10,1000,20):
Xpca = mypca(X,n_comp) # X is a 2 dimensional array
print Xpca
Your input dataset X is only a 2 dimensional array, the minimum n_comp is 10, so the PCA try to find the 10 best dimension for you. Since 10 > 2, you will always get the same answer. :)
It looks like you're trying to pass different values for number of components, and re-fit with each. A great thing about PCA is that it's actually not necessary to do this. You can fit the full number of components (even as many components as dimensions in your dataset), then simply discard the components you don't want (i.e. those with small variance). This is equivalent to re-fitting the entire model with fewer components. Saves a lot of computation.
How to do it:
# x = input data, size(<points>, <dimensions>)
# fit the full model
max_components = x.shape[1] # as many components as input dimensions
pca = PCA(n_components=max_components)
pca.fit(x)
# transform the data (contains all components)
y_all = pca.transform(x)
# keep only the top k components (with greatest variance)
k = 2
y = y_all[:, 0:k]
In terms of how to select the number of components, it depends what you want to do. One standard way of choosing the number of components k is to look at the fraction of variance explained (R^2) by each choice of k. If your data is distributed near a low-dimensional linear subspace, then when you plot R^2 vs. k, the curve will have an 'elbow' shape. The elbow will be located at the dimensionality of the subspace. It's good practice to look at this curve because it helps understand the data. Even if there's no clean elbow, it's common to choose a threshold value for R^2, e.g. to preserve 95% of the variance.
Here's how to do it (this should be done on the model with max_components components):
# Calculate fraction of variance explained
# for each choice of number of components
r2 = pca.explained_variance_.cumsum() / x.var(0).sum()
Another way you might want to proceed is to take the PCA-transformed data and feed it to a downstream algorithm (e.g. classifier/regression), then select your number of components based on the performance (e.g. using cross validation).
Side note: Maybe just a formatting issue, but your code block in mypca() should be indented, or it won't be interpreted as part of the function.

Categories

Resources