Recreating time series data using FFT results without using ifft

Recreating time series data using FFT results without using ifft - python

I analyzed the sunspots.dat data (below) using fft which is a classic example in this area. I obtained results from fft in real and imaginery parts. Then I tried to use these coefficients (first 20) to recreate the data following the formula for Fourier transform. Thinking real parts correspond to a_n and imaginery to b_n, I have
import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack
def f(Y,x):
total = 0
for i in range(20):
total += Y.real[i]*np.cos(i*x) + Y.imag[i]*np.sin(i*x)
return total
tempdata = np.loadtxt("sunspots.dat")
year=tempdata[:,0]
wolfer=tempdata[:,1]
Y=fft(wolfer)
n=len(Y)
print n
xs = linspace(0, 2*pi,1000)
gplt.plot(xs, [f(Y, x) for x in xs], '.')
gplt.show()
For some reason however, my plot does not mirror the one generated by ifft (I use the same number of coefficients on both sides). What could be wrong ?
Data:
http://linuxgazette.net/115/misc/andreasen/sunspots.dat

When you called fft(wolfer), you told the transform to assume a fundamental period equal to the length of the data. To reconstruct the data, you have to use basis functions of the same fundamental period = 2*pi/N. By the same token, your time index xs has to range over the time samples of the original signal.
Another mistake was in forgetting to do to the full complex multiplication. It's easier to think of this as Y[omega]*exp(1j*n*omega/N).
Here's the fixed code. Note I renamed i to ctr to avoid confusion with sqrt(-1), and n to N to follow the usual signal processing convention of using the lower case for a sample, and the upper case for total sample length. I also imported __future__ division to avoid confusion about integer division.
forgot to add earlier: Note that SciPy's fft doesn't divide by N after accumulating. I didn't divide this out before using Y[n]; you should if you want to get back the same numbers, rather than just seeing the same shape.
And finally, note that I am summing over the full range of frequency coefficients. When I plotted np.abs(Y), it looked like there were significant values in the upper frequencies, at least until sample 70 or so. I figured it would be easier to understand the result by summing over the full range, seeing the correct result, then paring back coefficients and seeing what happens.
from __future__ import division
import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack
def f(Y,x, N):
total = 0
for ctr in range(len(Y)):
total += Y[ctr] * (np.cos(x*ctr*2*np.pi/N) + 1j*np.sin(x*ctr*2*np.pi/N))
return real(total)
tempdata = np.loadtxt("sunspots.dat")
year=tempdata[:,0]
wolfer=tempdata[:,1]
Y=fft(wolfer)
N=len(Y)
print(N)
xs = range(N)
gplt.plot(xs, [f(Y, x, N) for x in xs])
gplt.show()

The answer from mtrw was extremely helpful and helped me answer the same question as the OP, but my head almost exploded trying to understand the nested loop.
Here's the last part but with numpy broadcasting (not sure if this even existed when the question was asked) rather than calling the f function:
xs = np.arange(N)
omega = 2*np.pi/N
phase = omega * xs[:,None] * xs[None,:]
reconstruct = Y[None,:] * (np.cos(phase) + 1j*np.sin(phase))
reconstruct = (reconstruct).sum(axis=1).real / N
# same output
plt.plot(reconstruct)
plt.plot(wolfer)

Related

Why is my code using 4th Runge-Kutta isn't giving me the expected values?

I'm having a little trouble trying to understand what's wrong with me code, any help would be extremely helpful.
I wanted to solve this simple equation
However, the values my code gives doesn't match with my book ones or wolfram ones as y goes up as x grows.
import matplotlib.pyplot as plt
from numpy import exp
from scipy.integrate import ode
# initial values
y0, t0 = [1.0], 0.0
def f(t, y):
f = [3.0*y[0] - 4.0/exp(t)]
return f
# initialize the 4th order Runge-Kutta solver
r = ode(f).set_integrator('dopri5')
r.set_initial_value(y0, t0)
t1 = 10
dt = 0.1
x, y = [], []
while r.successful() and r.t < t1:
x.append(r.t+dt); y.append(r.integrate(r.t+dt))
print(r.t+dt, r.integrate(r.t+dt))

Your equation in general has the solution
y(x) = (y0-1)*exp(3*x) + exp(-x)
Due to the choice of initial conditions, the exact solution does not contain the growing component of the first term. However, small perturbations due to discretization and floating point errors will generate a non-zero coefficient in the growing term. Now at the end of the integration interval this random coefficient is multiplied by exp(3*10)=1.107e+13 which will magnify small discretization errors of size 1e-7 to contributions in the result of size 1e+6 as observed when running the original code.
You can force the integrator to be more precise in its internal steps without reducing the output step size dt by setting error thresholds like in
r = ode(f).set_integrator('dopri5', atol=1e-16, rtol=1e-20)
However, you can not avoid the deterioration of the result completely as the floating point errors of size 1e-16 get magnified to global error contributions of size 1e-3.
Also, you should notice that each call of r.integrate(r.t+dt) will advance the integrator by dt so that the stored array and the printed values are in lock-step. If you want to just print the current state of the integrator use
print(r.t,r.y,yexact(r.t,y0))
where the last is to compare to the exact solution which is, as already said,
def yexact(x,y0):
return [ (y0[0]-1)*exp(3*x)+exp(-x) ]

Graphing an equation subject to constraints

Here is an image of the formula I am supposed to use, and some sample graphs that it should look like.
Here is an image that states the question I am working on.
What I have so far:
import numpy as np
import matplotlib.pyplot asplt
import math
def graph (formula):
x = np.arrange(-4,4,0.1)
y = formula(x)
plt.plot(x,y)
plt.show()
def my_formula(x):
return ((n**(n-.5))/(math.factorial(n-1)))*((1+x/(math.sqrt(n)))**(n-1))*
(math.e**(-n*(1+x/(math.sqrt(n)))))
n=1
graph(my_formula)
What I can't figure out is how to include the x>-sqrt(n) constraint into the equation. Any help at all would be much appreciated!!
-This is for a class that's not even about programming, yet we have to do this sort of stuff anyway, so I'm really not that great at it

The x > sqrt(n) constraint is putting a limit on the range of x-values, and you can easily do this when you create the x array.
Below is one way to do this (see the line with max and the subsequent line), but I also set the xlim of the graph to the -4 to 4 range to make it easy to compare different n-values:
import numpy as np
import matplotlib.pyplot as plt
from math import factorial, sqrt
def graph(formula, n):
xmin = max(-sqrt(n), -4) # set the lower x limit for the calculation
x = np.arange(xmin,4,0.1)
y = formula(x, n)
plt.plot(x,y)
plt.xlim(-4, 4) # set the graph limits to the full range
plt.show()
def my_formula(x, n):
z = 1 + x/sqrt(n)
a = n**(n-.5)/factorial(n-1)
b = z**(n-1)
c = np.exp(-n*z)
return a*b*c
n = 4
graph(my_formula, n)
The graph below has n=4 so xmin=-2.
Also, here I rewrote your code a bit too, though these were mostly for clarity. First, I passed in n as a variable, since the multiple graphs are changing n (and if you want to treat n as a constant, put it directly after the imports so people can find it). Second, I imported the math functions into the global namespace as math equations written in code are complicated enough without all the weight from the namespaces. Third, I broke the three terms of the equation into bite sized pieces. Overall, readability is important.

How to plot grad(f(x,y))?

I want to calculate and plot a gradient of any scalar function of two variables. If you really want a concrete example, lets say f=x^2+y^2 where x goes from -10 to 10 and same for y. How do I calculate and plot grad(f)? The solution should be vector and I should see vector lines. I am new to python so please use simple words.
EDIT:
#Andras Deak: thank you for your post, i tried what you suggested and instead of your test function (fun=3*x^2-5*y^2) I used function that i defined as V(x,y); this is how the code looks like but it reports an error
import numpy as np
import math
import sympy
import matplotlib.pyplot as plt
def V(x,y):
t=[]
for k in range (1,3):
for l in range (1,3):
t.append(0.000001*np.sin(2*math.pi*k*0.5)/((4*(math.pi)**2)* (k**2+l**2)))
term = t* np.sin(2 * math.pi * k * x/0.004) * np.cos(2 * math.pi * l * y/0.004)
return term
return term.sum()
x,y=sympy.symbols('x y')
fun=V(x,y)
gradfun=[sympy.diff(fun,var) for var in (x,y)]
numgradfun=sympy.lambdify([x,y],gradfun)
X,Y=np.meshgrid(np.arange(-10,11),np.arange(-10,11))
graddat=numgradfun(X,Y)
plt.figure()
plt.quiver(X,Y,graddat[0],graddat[1])
plt.show()
AttributeError: 'Mul' object has no attribute 'sin'
And lets say I remove sin, I get another error:
TypeError: can't multiply sequence by non-int of type 'Mul'
I read tutorial for sympy and it says "The real power of a symbolic computation system such as SymPy is the ability to do all sorts of computations symbolically". I get this, I just dont get why I cannot multiply x and y symbols with float numbers.
What is the way around this? :( Help please!
UPDATE
#Andras Deak: I wanted to make things shorter so I removed many constants from the original formulas for V(x,y) and Cn*Dm. As you pointed out, that caused the sin function to always return 0 (i just noticed). Apologies for that. I will update the post later today when i read your comment in details. Big thanks!
UPDATE 2
I changed coefficients in my expression for voltage and this is the result:
It looks good except that the arrows point in the opposite direction (they are supposed to go out of the reddish dot and into the blue one). Do you know how I could change that? And if possible, could you please tell me the way to increase the size of the arrows? I tried what was suggested in another topic (Computing and drawing vector fields):
skip = (slice(None, None, 3), slice(None, None, 3))
This plots only every third arrow and matplotlib does the autoscale but it doesnt work for me (nothing happens when i add this, for any number that i enter)
You were already of huge help , i cannot thank you enough!

Here's a solution using sympy and numpy. This is the first time I use sympy, so others will/could probably come up with much better and more elegant solutions.
import sympy
#define symbolic vars, function
x,y=sympy.symbols('x y')
fun=3*x**2-5*y**2
#take the gradient symbolically
gradfun=[sympy.diff(fun,var) for var in (x,y)]
#turn into a bivariate lambda for numpy
numgradfun=sympy.lambdify([x,y],gradfun)
now you can use numgradfun(1,3) to compute the gradient at (x,y)==(1,3). This function can then be used for plotting, which you said you can do.
For plotting, you can use, for instance, matplotlib's quiver, like so:
import numpy as np
import matplotlib.pyplot as plt
X,Y=np.meshgrid(np.arange(-10,11),np.arange(-10,11))
graddat=numgradfun(X,Y)
plt.figure()
plt.quiver(X,Y,graddat[0],graddat[1])
plt.show()
UPDATE
You added a specification for your function to be computed. It contains the product of terms depending on x and y, which seems to break my above solution. I managed to come up with a new one to suit your needs. However, your function seems to make little sense. From your edited question:
t.append(0.000001*np.sin(2*math.pi*k*0.5)/((4*(math.pi)**2)* (k**2+l**2)))
term = t* np.sin(2 * math.pi * k * x/0.004) * np.cos(2 * math.pi * l * y/0.004)
On the other hand, from your corresponding comment to this answer:
V(x,y) = Sum over n and m of [Cn * Dm * sin(2pinx) * cos(2pimy)]; sum goes from -10 to 10; Cn and Dm are coefficients, and i calculated
that CkDl = sin(2pik)/(k^2 +l^2) (i used here k and l as one of the
indices from the sum over n and m).
I have several problems with this: both sin(2*pi*k) and sin(2*pi*k/2) (the two competing versions in the prefactor are always zero for integer k, giving you a constant zero V at every (x,y). Furthermore, in your code you have magical frequency factors in the trigonometric functions, which are missing from the comment. If you multiply your x by 4e-3, you drastically change the spatial dependence of your function (by changing the wavelength by roughly a factor of a thousand). So you should really decide what your function is.
So here's a solution, where I assumed
V(x,y)=sum_{k,l = 1 to 10} C_{k,l} * sin(2*pi*k*x)*cos(2*pi*l*y), with
C_{k,l}=sin(2*pi*k/4)/((4*pi^2)*(k^2+l^2))*1e-6
This is a combination of your various versions of the function, with the modification of sin(2*pi*k/4) in the prefactor in order to have a non-zero function. I expect you to be able to fix the numerical factors to your actual needs, after you figure out the proper mathematical model.
So here's the full code:
import sympy as sp
import numpy as np
import matplotlib.pyplot as plt
def CD(k,l):
#return sp.sin(2*sp.pi*k/2)/((4*sp.pi**2)*(k**2+l**2))*1e-6
return sp.sin(2*sp.pi*k/4)/((4*sp.pi**2)*(k**2+l**2))*1e-6
def Vkl(x,y,k,l):
return CD(k,l)*sp.sin(2*sp.pi*k*x)*sp.cos(2*sp.pi*l*y)
def V(x,y,kmax,lmax):
k,l=sp.symbols('k l',integers=True)
return sp.summation(Vkl(x,y,k,l),(k,1,kmax),(l,1,lmax))
#define symbolic vars, function
kmax=10
lmax=10
x,y=sp.symbols('x y')
fun=V(x,y,kmax,lmax)
#take the gradient symbolically
gradfun=[sp.diff(fun,var) for var in (x,y)]
#turn into bivariate lambda for numpy
numgradfun=sp.lambdify([x,y],gradfun,'numpy')
numfun=sp.lambdify([x,y],fun,'numpy')
#plot
X,Y=np.meshgrid(np.linspace(-10,10,51),np.linspace(-10,10,51))
graddat=numgradfun(X,Y)
fundat=numfun(X,Y)
hf=plt.figure()
hc=plt.contourf(X,Y,fundat,np.linspace(fundat.min(),fundat.max(),25))
plt.quiver(X,Y,graddat[0],graddat[1])
plt.colorbar(hc)
plt.show()
I defined your V(x,y) function using some auxiliary functions for transparence. I left the summation cut-offs as literal parameters, kmax and lmax: in your code these were 3, in your comment they were said to be 10, and anyway they should be infinity.
The gradient is taken the same way as before, but when converting to a numpy function using lambdify you have to set an additional string parameter, 'numpy'. This will alow the resulting numpy lambda to accept array input (essentially it will use np.sin instead of math.sin and the same for cos).
I also changed the definition of the grid from array to np.linspace: this is usually more convenient. Since your function is almost constant at integer grid points, I created a denser mesh for plotting (51 points while keeping your original limits of (-10,10) fixed).
For clarity I included a few more plots: a contourf to show the value of the function (contour lines should always be orthogonal to the gradient vectors), and a colorbar to indicate the value of the function. Here's the result:
The composition is obviously not the best, but I didn't want to stray too much from your specifications. The arrows in this figure are actually hardly visible, but as you can see (and also evident from the definition of V) your function is periodic, so if you plot the same thing with smaller limits and less grid points, you'll see more features and larger arrows.

Plot function with large binomial coefficients

I would like to plot a function which involves binomial coefficients. The code I have is
#!/usr/bin/python
from __future__ import division
from scipy.special import binom
import matplotlib.pyplot as plt
import math
max = 500
ycoords = [sum([binom(n,w)*sum([binom(w,k)*(binom(w,k)/2**w)**(4*n/math.log(n)) for k in xrange(w+1)]) for w in xrange(1,n+1)]) for n in xrange(2,max)]
xcoords = range(2,max)
plt.plot(xcoords, ycoords)
plt.show()
Unfortunately this never terminates. If you reduce max to 40 say it works fine. Is there some way to plot this function?
I am also worried that scipy.special.binom might not be giving accurate answers as it works in floating point it seems.

You can get significant speedup by using numpy to compute the inner loop. First change max to N (since max is a builtin) and break up your function into smaller, more manageable chunks:
N = 500
X = np.arange(2,N)
def k_loop(w,n):
K = np.arange(0, w+1)
return (binom(w,K)*(binom(w,K)/2**w)**(float(n)/np.log(n))).sum()
def w_loop(n):
v = [binom(n,w)*k_loop(w,n) for w in range(1,n+1)]
return sum(v)
Y = [w_loop(n) for n in X]
Using N=300 as a test it takes 3.932s with the numpy code, but 81.645s using your old code. I didn't even time the N=500 case since your old code took so long!
It's worth pointing out that your function is basically exponential growth and can be approximated as such. You can see this in a semilogx plot:

Fourier transform of a Gaussian is not a Gaussian, but thats wrong! - Python

I am trying to utilize Numpy's fft function, however when I give the function a simple gausian function the fft of that gausian function is not a gausian, its close but its halved so that each half is at either end of the x axis.
The Gaussian function I'm calculating is
y = exp(-x^2)
Here is my code:
from cmath import *
from numpy import multiply
from numpy.fft import fft
from pylab import plot, show
""" Basically the standard range() function but with float support """
def frange (min_value, max_value, step):
value = float(min_value)
array = []
while value < float(max_value):
array.append(value)
value += float(step)
return array
N = 256.0 # number of steps
y = []
x = frange(-5, 5, 10/N)
# fill array y with values of the Gaussian function
cache = -multiply(x, x)
for i in cache: y.append(exp(i))
Y = fft(y)
# plot the fft of the gausian function
plot(x, abs(Y))
show()
The result is not quite right, cause the FFT of a Gaussian function should be a Gaussian function itself...

np.fft.fft returns a result in so-called "standard order": (from the docs)
If A = fft(a, n), then A[0]
contains the zero-frequency term (the
mean of the signal), which is always
purely real for real inputs. Then
A[1:n/2] contains the
positive-frequency terms, and
A[n/2+1:] contains the
negative-frequency terms, in order of
decreasingly negative frequency.
The function np.fft.fftshift rearranges the result into the order most humans expect (and which is good for plotting):
The routine np.fft.fftshift(A)
shifts transforms and their
frequencies to put the zero-frequency
components in the middle...
So using np.fft.fftshift:
import matplotlib.pyplot as plt
import numpy as np
N = 128
x = np.arange(-5, 5, 10./(2 * N))
y = np.exp(-x * x)
y_fft = np.fft.fftshift(np.abs(np.fft.fft(y))) / np.sqrt(len(y))
plt.plot(x,y)
plt.plot(x,y_fft)
plt.show()

Your result is not even close to a Gaussian, not even one split into two halves.
To get the result you expect, you will have to position your own Gaussian with the center at index 0, and the result will also be positioned that way. Try the following code:
from pylab import *
N = 128
x = r_[arange(0, 5, 5./N), arange(-5, 0, 5./N)]
y = exp(-x*x)
y_fft = fft(y) / sqrt(2 * N)
plot(r_[y[N:], y[:N]])
plot(r_[y_fft[N:], y_fft[:N]])
show()
The plot commands split the arrays in two halfs and swap them to get a nicer picture.

It is being displayed with the center (i.e. mean) at coefficient index zero. That is why it appears that the right half is on the left, and vice versa.
EDIT: Explore the following code:
import scipy
import scipy.signal as sig
import pylab
x = sig.gaussian(2048, 10)
X = scipy.absolute(scipy.fft(x))
pylab.plot(x)
pylab.plot(X)
pylab.plot(X[range(1024, 2048)+range(0, 1024)])
The last line will plot X starting from the center of the vector, then wrap around to the beginning.

A fourier transform implicitly repeats indefinitely, as it is a transform of a signal that implicitly repeats indefinitely. Note that when you pass y to be transformed, the x values are not supplied, so in fact the gaussian that is transformed is one centred on the median value between 0 and 256, so 128.
Remember also that translation of f(x) is phase change of F(x).

Following on from Sven Marnach's answer, a simpler version would be this:
from pylab import *
N = 128
x = ifftshift(arange(-5,5,5./N))
y = exp(-x*x)
y_fft = fft(y) / sqrt(2 * N)
plot(fftshift(y))
plot(fftshift(y_fft))
show()
This yields a plot identical to the above one.
The key (and this seems strange to me) is that NumPy's assumed data ordering --- in both frequency and time domains --- is to have the "zero" value first. This is not what I'd expect from other implementations of FFT, such as the FFTW3 libraries in C.
This was slightly fudged in the answers from unutbu and Steve Tjoa above, because they're taking the absolute value of the FFT before plotting it, thus wiping away the phase issues resulting from not using the "standard order" in time.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Recreating time series data using FFT results without using ifft - python

Related

Why is my code using 4th Runge-Kutta isn't giving me the expected values?

Graphing an equation subject to constraints

How to plot grad(f(x,y))?

Plot function with large binomial coefficients

Fourier transform of a Gaussian is not a Gaussian, but thats wrong! - Python

Categories

Resources