I have a Morlet wavelet, which is described by a plane wave multiplied by a Gaussian window and a scaling parameter, s. I.e., in Python:
import numpy

f = 10
omega = 2*numpy.pi*f
s = 2.0  # scaling parameter (varied below)
x = numpy.linspace(-5, 5, num=1000)
wavelet = numpy.exp(1j*omega*x/s) * numpy.exp(-1.0*(x/s)**2/2.0)
Usually, doubling the scaling parameter (also known as the "level") of a wavelet halves its bandwidth. However, when I plot the FFT of the wavelet described above for different scales, s = 2**i, with i = 1, 2, 3, ..., the width is not halved for each successive i.
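For reference, this is roughly the sweep I'm plotting (the plotting details here are just a sketch):

import numpy
import matplotlib.pyplot as plt

f = 10
omega = 2*numpy.pi*f
x = numpy.linspace(-5, 5, num=1000)
dx = x[1] - x[0]

for i in (1, 2, 3):
    s = 2**i
    wavelet = numpy.exp(1j*omega*x/s) * numpy.exp(-1.0*(x/s)**2/2.0)
    spectrum = numpy.abs(numpy.fft.fftshift(numpy.fft.fft(wavelet)))
    freqs = numpy.fft.fftshift(numpy.fft.fftfreq(x.size, d=dx))
    plt.plot(freqs, spectrum, label='s = %d' % s)
plt.legend()
plt.show()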
What's wrong with the Morlet wavelet?
The code that you have provided does not look (to me) as if it properly constructs the Morlet wavelet. The paper A Practical Guide to Wavelet Analysis provides a great guide to the construction of wavelet transforms and should explain the effect of varying the wavelet scale.
Note that, depending on your implementation, changing the wavelet scale will not update/change the scale of the FFT used to create the wavelet. Typically, the FFT is constructed first and then used to build the discrete wavelet transform, so changing the wavelet scale does not affect the underlying FFT.
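For comparison, here is a minimal sketch of the Morlet mother wavelet as defined in that paper (the commonly used value omega0 = 6 is assumed here); note the pi**(-1/4) normalisation, and that the scale enters only through eta = t/s:

import numpy

def morlet(t, s, omega0=6.0):
    # Morlet mother wavelet, psi0(eta) = pi**(-1/4) * exp(1j*omega0*eta) * exp(-eta**2/2),
    # evaluated at eta = t/s (sketch following the definition in Torrence & Compo)
    eta = t / s
    return numpy.pi**(-0.25) * numpy.exp(1j*omega0*eta) * numpy.exp(-eta**2/2.0)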
I hope this helps.
I tried to reproduce Watson's spectrum plot from this set of slides (PDF p. 30, p. 29 of the slides), which is based on this data on housing building permits.
Watson achieves a very smooth spectrum curve from which it is very easy to identify the peak frequencies.
When I try to run an FFT on the data, I get a really noisy spectrum curve, and I wonder if there is an intermediate step that I am missing.
I ran the Fourier analysis in Python, using the scipy package fftpack, as follows:
import numpy as np
import matplotlib.pyplot as plt
from scipy import fftpack

fs = 1 / 12  # monthly sampling
N = data.shape[0]
spectrum = fftpack.fft(data.PERMITNSA.values)
freqs = fftpack.fftfreq(len(spectrum))  # * fs
plt.plot(freqs[:N//2], 20 * np.log10(np.abs(spectrum[:N//2])))
Could anyone help me with the missing link?
The original data is:
Below is Watson's spectrum curve, the one I tried to reproduce:
And these are my results:
The posted curve doesn't look realistic. But there are many methods to get a smooth result with a similar amount of "curviness", using various kinds of resampling and/or plot interpolation.
One method I like is to chop the data into segments (windows, possibly overlapping) roughly 4X longer than the maximum number of "bumps" you want to see, maybe a bit longer. Then window each segment before running a much longer zero-padded FFT (sized to roughly the resolution of the final plot you want). Then average the results of the multiple FFTs of the multiple windowed segments. This works because a zero-padded FFT is (almost) equivalent to a highest-quality Sinc-interpolating low-pass filter.
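A rough sketch of that idea using scipy's Welch estimator (the segment length, overlap, and FFT size here are placeholder choices, and data stands in for the permit series from the question):

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import welch

# Average the windowed, zero-padded FFTs of overlapping segments:
# nperseg limits how many "bumps" survive, nfft >> nperseg interpolates the plot.
f, Pxx = welch(data, fs=1.0, window='hann',
               nperseg=64, noverlap=32, nfft=1024)
plt.plot(f, 10 * np.log10(Pxx))
plt.show()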
I've been tasked to develop an algorithm that, given a set of sparse points representing measurements of an existing surface, would allow us to compute the z coordinate of any point on the surface. The challenge is to find a suitable interpolation method that can recreate the 3D surface given only a few points and extrapolate values also outside of the range containing the initial measurements (a notorious problem for many interpolation methods).
After trying to fit many analytic curves to the points, I decided to use RBF interpolation, as I thought this would better reproduce the surface given that the points should all lie on it (I'm assuming the measurements have a negligible error).
The first results are quite impressive considering the few points that I'm using.
Interpolation results
In the picture that I'm showing the blue points are the ones used for the RBF interpolation which produces the shape represented in gray scale. The red points are instead additional measurements of the same shape that I'm trying to reproduce with my interpolation algorithm.
Unfortunately there are some outliers, especially when I'm trying to extrapolate points outside of the area where the initial measurements were taken (you can see this in the upper right and lower center insets in the picture). This is to be expected, especially in RBF methods, as I'm trying to extract information from an area that initially does not have any.
Apparently, the RBF interpolation is trying to flatten out the surface, while I would need it to simply continue with the curvature of the shape. Of course, the method does not know anything about that, given how it is defined. However, this causes a large discrepancy from the measurements that I'm trying to fit.
That's why I'm asking if there is any way to constrain the interpolation method to keep the curvature, or to use a different radial basis function that doesn't smooth out so quickly right at the border of the interpolation range. I've tried different combinations of the epsilon parameter and distance functions without luck. This is what I'm using right now:
from scipy import interpolate
import numpy as np

spline = interpolate.Rbf(df.X.values, df.Y.values, df.Z.values,
                         function='thin_plate')
X, Y = np.meshgrid(np.linspace(xmin.round(), xmax.round(), precision),
                   np.linspace(ymin.round(), ymax.round(), precision))
Z = spline(X, Y)
I was also thinking of creating some additional dummy points outside of the interpolation range to constrain the model even more, but that would be quite complicated.
I'm also attaching an animation to give a better idea of the surface.
Animation
Just wanted to post my solution in case someone has the same problem. The issue was indeed with scipy's implementation of the RBF interpolation. I instead adopted a more flexible library, https://rbf.readthedocs.io/en/latest/index.html#.
The results are pretty cool! Using the following options:
from rbf.interpolate import RBFInterpolant
spline = RBFInterpolant(X_obs, U_obs, phi='phs5', order=1, sigma=0.0, eps=1.)
I was able to get the right shape even at the edge.
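For completeness, the interpolant is then evaluated on a regular grid roughly like this (the grid bounds and resolution are the same placeholders as before, and the evaluation points are stacked into an (N, 2) array of x, y coordinates):

import numpy as np

x_grid, y_grid = np.meshgrid(np.linspace(xmin, xmax, precision),
                             np.linspace(ymin, ymax, precision))
points = np.column_stack([x_grid.ravel(), y_grid.ravel()])
z_grid = spline(points).reshape(x_grid.shape)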
Surface interpolation
I've played around with the different phi functions and here is the boxplot of the spread between the interpolated surface and the points that I'm testing the interpolation against (the red points in the picture).
Boxplot
With phs5 I get the best result, with an average spread of about 0.5 mm on the upper surface and 0.8 mm on the lower surface. Before, I was getting a similar average but with many outliers > 15 mm. Definitely a success :)
I am trying to evaluate the frequency domain of several signals. For this I used the PSD implementation given in this answer. As a comparison I used the signal.periodogram function provided in scipy:
from scipy.signal import periodogram
from scipy.signal.windows import tukey

f, Pxx_den = periodogram(a_gtrend_orig, 12, window=tukey(len(a_gtrend_orig)))
However when I plot this next to the self-implemented PSD they look significantly different:
As the same window function is used, and the periodogram function should also use an FFT, where does this difference come from?
The example that you are comparing this to is graphing the amplitude at each frequency bin, i.e., abs(fft()).
The periodogram produces a power spectral density, which means it is (up to a scaling factor) the square of the amplitude at each frequency bin.
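A small sketch of that relationship: with a rectangular window and detrending disabled, scipy's density-scaled periodogram is just |FFT|**2 scaled by 1/(fs*N), with the non-DC, non-Nyquist bins doubled for the one-sided spectrum.

import numpy as np
from scipy.signal import periodogram

rng = np.random.default_rng(0)
fs = 12.0
x = rng.standard_normal(256)

# Density-scaled periodogram with a rectangular window, no detrending
f, Pxx = periodogram(x, fs=fs, window='boxcar', detrend=False)

# Same thing built directly from the FFT
N = len(x)
X = np.fft.rfft(x)
Pxx_manual = np.abs(X)**2 / (fs * N)
Pxx_manual[1:-1] *= 2   # one-sided: double everything except DC and Nyquist (N is even)
print(np.allclose(Pxx, Pxx_manual))   # True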
The label "windowed psd" is from an earlier edit and was corrected later.
I have this signal, for which I want to calculate the dominant wavelength, which would be the distance between the pronounced minima where the oscillations occur:
Which tool in scipy should I look into for this mission?
It depends where you get the data from.
If you only have the (x, y) points of the graph, you can either hack it by taking all the x values corresponding to the minimal y (be careful of floating-point equality, though), or use the Fourier transform: identify the main wave (the one with the biggest amplitude) and deduce its wavelength. For the latter, you would use the Fast Fourier Transform from scipy: https://docs.scipy.org/doc/scipy-0.18.1/reference/tutorial/fftpack.html#fast-fourier-transforms
If you have an analytic description of the function, either sample it like you do to construct the graph and apply the above, or take its derivative to find the minima mathematically (the best method). You could also use scipy to find the minima numerically (https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.optimize.minimize.html), but you have to manually specify intervals that each contain only one local minimum.
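For the FFT route, a minimal sketch (the signal here is synthetic; with real data you would substitute your sampled x and y values):

import numpy as np

# Synthetic example: a wave of wavelength ~2.5 plus a little noise
x = np.linspace(0, 10, 1000)
y = np.sin(2*np.pi*x/2.5) + 0.1*np.random.randn(x.size)

dx = x[1] - x[0]                        # sample spacing
Y = np.fft.rfft(y - y.mean())           # subtract the mean so the DC bin doesn't dominate
freqs = np.fft.rfftfreq(y.size, d=dx)   # cycles per unit of x

k = np.argmax(np.abs(Y[1:])) + 1        # skip the DC bin, take the biggest amplitude
dominant_wavelength = 1.0 / freqs[k]
print(dominant_wavelength)              # ~2.5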
I am interested in using the scipy peak finding code to find peaks in a 1-d vector. While looking through the source to get a better understanding of how it works, I ran across how the wavelets are generated for use in the CWT function.
https://github.com/scipy/scipy/blob/v0.14.0/scipy/signal/wavelets.py#L359
The line in question is 359,
wavelet_data = wavelet(min(10 * width, len(data)), width)
It looks like all of the wavelets are vectors whose lengths are multiples of 10. If the wavelet is "ricker", then all of the scaled wavelets are missing that nice peaked center point because they are even in length. It seems like if they were all odd in length, then at the lower scales the correlations with sharp, pointy peaks might be slightly better; it probably makes much less of a difference at the higher scales.
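A quick way to see the difference with scipy.signal.ricker (available in the scipy version linked above; the lengths here are arbitrary):

from scipy.signal import ricker

# With an even number of points the sample grid straddles t = 0, so the true
# centre peak of the Ricker wavelet is never sampled; an odd length hits t = 0 exactly.
even = ricker(10, 1.0)
odd = ricker(11, 1.0)
print(even.max(), odd.max())  # only the odd-length wavelet reaches the full peak value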
Does it matter whether the wavelet vector used for correlation is odd or even in length? What about for the case of peak finding?
I am not a wavelet expert so if I have missed something obvious, please let me know.