use scipy.integrate.simps or similar to integrate three vectors - python

I want to approximate a function that I do not have an actual analytical expression for. I know that I want to compute this integral: integral a * b * c dx. Pretend that a, b, and c come from observed data. How can I evaluate this integral? Can scipy do this? Is scipy.integrate.simps the right approach?
import numpy as np
from scipy.integrate import simps
a = np.random.random(10)
b = np.random.uniform(0, 10, 10)
c = np.random.normal(2, .8, 10)
x = np.linspace(0, 1, 10)
dx = x[1] - x[0]
print('Is the integral of a * b * c dx', simps(a * b, c, dx), ',', simps(b * a, c, dx), ',', simps(a, b * c, dx), ',', simps(a, c * b, dx), ', or something else?')

With your setup, the correct way to integrate is either
simps(a*b*c, x) # function values, argument values
or
simps(a*b*c, dx=dx) # function values, uniform spacing between x-values
Both yield the same result. Yes, simps is a very good choice for integrating sampled data. Most of the time it is more accurate than trapz.
If the data comes from a smooth function (even though you don't know the function), and you can somehow make the number of points to be 1 more than a power of 2, then Romberg integration will be even better. I compared trapz vs simps vs quad in this post.
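As a quick check of the advice above, here is a minimal sketch that integrates the product of three sampled vectors whose integral is known in closed form. It assumes a recent SciPy, where `simps` is available under the name `simpson` (the old name was removed in SciPy 1.14). The arrays `a`, `b` and `c` are illustrative stand-ins (each equal to x), not the asker's random data.

```python
import numpy as np
from scipy.integrate import simpson, romb

# 2**3 + 1 = 9 equally spaced samples, so Romberg integration applies as well
x = np.linspace(0, 1, 9)
dx = x[1] - x[0]

# Stand-in "observed" vectors; their product is x**3, whose integral on [0, 1] is 0.25
a = x.copy()
b = x.copy()
c = x.copy()

# The integrand is the elementwise product of the three vectors
y = a * b * c

integral_x = simpson(y, x=x)     # pass the sample positions
integral_dx = simpson(y, dx=dx)  # or pass the uniform spacing
integral_romb = romb(y, dx=dx)   # Romberg on 2**k + 1 samples

print(integral_x, integral_dx, integral_romb)
```

The key point is that the three vectors are multiplied first and the product is passed as the single array of function values; the second argument is for the x positions, not for another factor of the integrand.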

Related

Appending Numpy Array with Another Numpy Array

I have a NumPy array with equations solved symbolically, with constants a and b. Here's an example of the cell at index (2,0) in my array "bounds_symbolic":
-a*sqrt(1/(a**6*b**2+1))
I also have an array, called "a_values", that I would like to substitute into my "bounds_symbolic" array. I also have the b-value set to 1, which I would also like to substitute in. Keeping the top row of the arrays intact would also be nice.
In other words, for the cell indexed at (2,0) in "bounds_symbolic", I want to substitute all of my a and b-values into the equation, while extending the column to contain the substituted equations. I then want to do this operation for the entirety of the "bounds_symbolic" array.
Here is the code that I have so far:
import sympy
import numpy as np
a, b, x, y = sympy.symbols("a b x y")
# Equation of the ellipse solved for y
ellipse = sympy.sqrt((b ** 2) * (1 - ((x ** 2) / (a ** 2))))
# Functions to be tested
test_functions = np.array(
[(a * b * x), (((a * b) ** 2) * x), (((a * b) ** 3) * x), (((a * b) ** 4) * x), (((a * b) ** 5) * x)])
# Equating ellipse and test_functions so their intersection can be symbolically solved for
equate = np.array(
[sympy.Eq(ellipse, test_functions[0]), sympy.Eq(ellipse, test_functions[1]), sympy.Eq(ellipse, test_functions[2]),
sympy.Eq(ellipse, test_functions[3]), sympy.Eq(ellipse, test_functions[4])])
# Calculating the intersection points of the ellipse and the testing functions
# Array that holds the bounds of the integral solved symbolically
bounds_symbolic = np.array([])
for i in range(0, 5):
    bounds_symbolic = np.append(bounds_symbolic, sympy.solve(equate[i], x))
# Array of a-values to plug into the bounds of the integral
a_values = np.array(np.linspace(-10, 10, 201))
# Setting b equal to a constant of 1
b = 1
integrand = np.array([])
for j in range(0, 5):
    integrand = np.append(integrand, (ellipse - test_functions[j]))
# New array with a-values substituted into the bounds
bounds_a = bounds_symbolic
# for j in range(0, 5):
# bounds_a = np.append[:, ]
Thank you!
Numpy arrays are the best choice when working with pure numerical data, for which they can help speed up many types of calculations. Once you start mixing sympy expressions, things can get very messy. You'll also lose all the speed advantages of numpy arrays.
Apart from that, np.append is a very slow operation, as it needs to recreate the complete array every time it is executed. When creating a new numpy array, the recommended way is to first create an array that already has its final size (e.g. with np.zeros()) and then fill it.
You should also check out Python's list comprehension as it eases the creation of lists. In "pythonic" code, indices are used as little as possible. List comprehension may look a bit weird when you are used to other programming languages, but you quickly get used to them, and from then on you'll certainly prefer them.
In your example code, numpy is useful for the np.linspace command, which creates an array of numbers (again converting them with np.array isn't necessary). And at the end, you might want to convert the substituted values to a numpy array. Note that this won't work when solve would return a different number of solutions for some of the equations, as numpy arrays need an equal size for all its elements. Also note that an explicit conversion from sympy's numerical type to a dtype understood by numpy might be needed. (Sympy often works with higher precision, not caring for the loss of speed.)
Also note that if you assign b = 1, you create a new variable and lose the variable pointing to the sympy symbol. It's recommended to use another name. Just writing b = 1 will not change the value of the symbol. You need subs to substitute symbols with values.
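The difference between rebinding a name and substituting a symbol can be seen in a small sketch. The expression `expr` and the name `b_val` are only illustrative.

```python
import sympy

a, b = sympy.symbols("a b")
expr = a * b + b

# Keep the number under a different name so that `b` still refers to the symbol
b_val = 1

# subs returns a new expression; expr itself is left unchanged
substituted = expr.subs(b, b_val)

print(expr)
print(substituted)
```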
Summarizing, your code could look like this:
import sympy
import numpy as np
a, b, x, y = sympy.symbols("a b x y")
# Equation of the ellipse solved for y
ellipse = sympy.sqrt((b ** 2) * (1 - ((x ** 2) / (a ** 2))))
# Functions to be tested
test_functions = [a * b * x, ((a * b) ** 2) * x, ((a * b) ** 3) * x, ((a * b) ** 4) * x, ((a * b) ** 5) * x]
# Equating ellipse and test_functions so their intersection can be symbolically solved for
# Array that holds the bounds of the integral solved symbolically
bounds_symbolic = [sympy.solve(sympy.Eq(ellipse, fun), x) for fun in test_functions]
# Array of a-values to plug into the bounds of the integral
a_values = np.linspace(-10, 10, 201)
# Setting b equal to a constant of 1
b_val = 1
# New array with a-values substituted into the bounds
bounds_a = [[[bound.subs({a: a_val, b: b_val}) for bound in bounds]
for bounds in bounds_symbolic]
for a_val in a_values]
bounds_a = np.array(bounds_a, dtype='float') # shape: (201, 5, 2)
The values of the resulting array can for example be used for plotting:
import matplotlib.pyplot as plt
for i, (test_func, color) in enumerate(zip(test_functions, plt.cm.Set1.colors)):
    plt.plot(a_values, bounds_a[:, i, 0], color=color, label=test_func)
    plt.plot(a_values, bounds_a[:, i, 1], color=color, alpha=0.5)
plt.legend()
plt.margins(x=0)
plt.xlabel('a')
plt.ylabel('bounds')
plt.show()
Or filled:
for i, (test_func, color) in enumerate(zip(test_functions, plt.cm.Set1.colors)):
    plt.plot(a_values, bounds_a[:, i, :], color=color)
    plt.fill_between(a_values, bounds_a[:, i, 0], bounds_a[:, i, 1], color=color, alpha=0.1)
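When many numeric values have to be substituted, calling subs inside nested loops can become slow. One alternative, offered here as a sketch and not as part of the answer above, is sympy.lambdify, which turns a symbolic expression into a function that works on NumPy arrays. The expression used below is the example cell quoted in the question; the name `bound_func` is hypothetical.

```python
import numpy as np
import sympy

a, b = sympy.symbols("a b")

# Example cell (2, 0) of bounds_symbolic, as quoted in the question
bound = -a * sympy.sqrt(1 / (a**6 * b**2 + 1))

# Build a NumPy-vectorized function of (a, b) from the symbolic expression
bound_func = sympy.lambdify((a, b), bound, modules="numpy")

a_values = np.linspace(-10, 10, 201)
b_val = 1.0

# Evaluate the bound for all a-values at once
bound_values = bound_func(a_values, b_val)

print(bound_values.shape, bound_values.dtype)
```

The result is already a plain float array, so no explicit dtype conversion from sympy's numeric types is needed in this case.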

Multivariate curve-fitting in python for estimating the parameter and order of ellipse-like shapes

I'm trying to find the best parameters (a, b, and c) of the following function (general formula of circle, ellipse, or rhombus):
(|x|/a)^c + (|y|/b)^c = 1
of two arrays of independent data (x and y) in python. My main objective is to estimate the best value of (a, b, and c) based on my x and y variable. I am using curve_fit function from scipy, so here is my code with a demo x, and y.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
alpha = 5
beta = 3
N = 500
DIM = 2
np.random.seed(2)
theta = np.random.uniform(0, 2*np.pi, (N,1))
eps_noise = 0.2 * np.random.normal(size=[N,1])
circle = np.hstack([np.cos(theta), np.sin(theta)])
B = np.random.randint(-3, 3, (DIM, DIM))
noisy_ellipse = circle.dot(B) + eps_noise
X = noisy_ellipse[:,0:1]
Y = noisy_ellipse[:,1:]
def func(xdata, a, b, c):
    x, y = xdata
    return (np.abs(x)/a)**c + (np.abs(y)/b)**c
xdata = np.transpose(np.hstack((X, Y)))
ydata = np.ones((xdata.shape[1],))
pp, pcov = curve_fit(func, xdata, ydata, maxfev = 1000000, bounds=((0, 0, 1), (50, 50, 2)))
plt.scatter(X, Y, label='Data Points')
x_coord = np.linspace(-5,5,300)
y_coord = np.linspace(-5,5,300)
X_coord, Y_coord = np.meshgrid(x_coord, y_coord)
Z_coord = func((X_coord,Y_coord),pp[0],pp[1],pp[2])
plt.contour(X_coord, Y_coord, Z_coord, levels=[1], colors=('g'), linewidths=2)
plt.legend()
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
By using this code, the parameters are [4.69949891, 3.65493859, 1.0] for a, b, and c.
The problem is that c usually ends up at the lower end of its bounds, whereas for this demo data c should be very close to 2, since the data represent an ellipse.
Any help and suggestions for solving this issue are appreciated.
A curve whose equation is (|x/a|)^c + (|y/b|)^c = 1 is called a "superellipse":
http://mathworld.wolfram.com/Superellipse.html
For large c the superellipse tends to a rectangular shape.
For c=2 the curve is an ellipse, or a circle in the particular case a=b.
For c close to 1 the superellipse tends to a rhombus shape.
For c larger than 0 and lower than 1 the superellipse looks like a (squashed) astroid with sharp vertices. This kind of shape will not be considered below.
Before turning to the OP's actual question, it is worth studying how regression behaves when a superellipse is fitted to scattered data. A short, simplified experiment helps to show the mathematical difficulty before the programming difficulties.
As the scatter increases, the computed value of c (the one that minimizes the MSE) decreases. The minimum also becomes harder and harder to locate, which is certainly a difficulty for software.
For even larger scatter, the fit gives c=1, i.e. a rhombus shape.
So it is not surprising that, for the highly scattered example published by the OP, the software returned a rhombus as the fitted curve.
If this is not the expected result, one has to choose a goal other than the minimum MSE. For example, if the goal is to obtain an elliptic shape, one has to set c=2. The result in the next figure shows that the MSE is worse than with the preceding rhombus shape, but the elliptic fit is well achieved.
NOTE: With large scatter, the result depends heavily on the choice of fitting criterion (MSE, MAE, ..., and with respect to which variable). This can cause very different results from one software package to another if their fitting criteria (sometimes not explicit) differ.
If the fitting requirements specify that the rhombus shape is excluded, one has to define a more representative criterion and/or model and implement it in the software.
IMPORTANCE OF CRITERIA OF FITTING :
To show how much the choice of fitting criterion matters, especially for highly scattered data, we repeat the study with a different criterion.
Instead of the preceding criterion, which was the MSE of the errors on the superellipse equation itself, namely:
we choose a different criterion, for example the MSE of the errors on the radial coordinate in the polar system:
The notations are defined in the next picture:
Some results from the empirical study for increasing scatter:
We observe that the numerical computation is more robust with the second criterion than with the first. Cases with higher scatter can be treated with the second fitting criterion.
The drawback is that this second criterion is probably not available in existing software. So one has to implement the above formulas in existing software if possible, or write software specially adapted to it.
Nevertheless, this discussion about fitting criteria is somewhat off topic, because the criterion should not result from mathematical considerations only. If the problem comes from a practical need in physics or technology, the fitting criterion may be dictated by reality, leaving no choice.
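The formulas of this answer were posted as images and are not reproduced here, so the following is only a sketch of what a radial criterion could look like, under stated assumptions. It uses the fact that a point at polar angle theta on the superellipse (|x|/a)^c + (|y|/b)^c = 1 has radius r(theta) = (|cos(theta)/a|^c + |sin(theta)/b|^c)^(-1/c), and it minimizes the squared differences between the observed radii and these model radii with scipy.optimize.least_squares. The helper names, the synthetic data, the starting values and the bounds are all hypothetical choices.

```python
import numpy as np
from scipy.optimize import least_squares

def superellipse_radius(theta, a, b, c):
    """Radius of the superellipse (|x|/a)**c + (|y|/b)**c = 1 at polar angle theta."""
    return (np.abs(np.cos(theta) / a) ** c
            + np.abs(np.sin(theta) / b) ** c) ** (-1.0 / c)

def radial_residuals(params, x, y):
    """Observed radius minus model radius at the same polar angle."""
    a, b, c = params
    r = np.hypot(x, y)
    theta = np.arctan2(y, x)
    return r - superellipse_radius(theta, a, b, c)

# Synthetic data: an ellipse (c = 2) with a = 5, b = 3 and small radial noise
rng = np.random.default_rng(0)
theta_data = rng.uniform(0, 2 * np.pi, 300)
r_data = superellipse_radius(theta_data, 5.0, 3.0, 2.0) + rng.normal(0, 0.02, 300)
x = r_data * np.cos(theta_data)
y = r_data * np.sin(theta_data)

# Start from the data extent and a deliberately wrong exponent
start = [np.abs(x).max(), np.abs(y).max(), 1.5]
fit = least_squares(radial_residuals, start, args=(x, y),
                    bounds=([0.1, 0.1, 0.5], [50.0, 50.0, 10.0]))
a_fit, b_fit, c_fit = fit.x
print(a_fit, b_fit, c_fit)
```

This sketch assumes the curve is centred at the origin and aligned with the axes, as in the question's model; rotated or shifted data would need extra parameters.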
I have modified your code (though you took it from https://stackoverflow.com/a/47881806/10640534) quite a lot, but I think I have what you expect. I am using a different equation, which I found here. I have also used the new Numpy random generators, but I believe that is only aesthetic for this problem. I am drawing the ellipse using patches from matplotlib, which indeed is aesthetic, but definitely a way better solution to represent your conic. Importantly, I am using the dogbox method for curve_fit because other methods do not converge; occasionally the ellipse is not matched, and decreasing the added noise helps (e.g., rng.normal(0, 1, (500, 2)) / 1e2 instead of rng.normal(0, 1, (500, 2)) / 1e1). Anyway, snippet and figure below.
import numpy as np
from numpy.random import default_rng
from matplotlib.patches import Ellipse
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(data, a, b, h, k, A):
    x, y = data
    return ((((x - h) * np.cos(A) + (y - k) * np.sin(A)) / a) ** 2
            + (((x - h) * np.sin(A) - (y - k) * np.cos(A)) / b) ** 2)
rng = default_rng(3)
numPoints = 500
center = rng.random(2) * 10 - 5
theta = rng.uniform(0, 2 * np.pi, (numPoints, 1))
circle = np.hstack([np.cos(theta), np.sin(theta)])
ellipse = (circle.dot(rng.random((2, 2)) * 2 * np.pi - np.pi)
+ (center[0], center[1]) + rng.normal(0, 1, (500, 2)) / 1e1)
pp, pcov = curve_fit(func, (ellipse[:, 0], ellipse[:, 1]), np.ones(numPoints),
p0=(1, 1, center[0], center[1], np.pi / 2),
method='dogbox')
plt.scatter(ellipse[:, 0], ellipse[:, 1], label='Data Points')
plt.gca().add_patch(Ellipse(xy=(pp[2], pp[3]), width=2 * pp[0],
height=2 * pp[1], angle=pp[4] * 180 / np.pi,
fill=False))
plt.gca().set_aspect('equal')
plt.tight_layout()
plt.show()
To incorporate the value of the exponent, I have used your equation and generated an ellipse according to this answer. This results in:
import numpy as np
from numpy.random import default_rng
from matplotlib.patches import Ellipse
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit, root
from scipy.special import ellipeinc
def angles_in_ellipse(num, a, b):
    assert(num > 0)
    assert(a < b)
    angles = 2 * np.pi * np.arange(num) / num
    if a != b:
        e = (1.0 - a ** 2.0 / b ** 2.0) ** 0.5
        tot_size = ellipeinc(2.0 * np.pi, e)
        arc_size = tot_size / num
        arcs = np.arange(num) * arc_size
        res = root(lambda x: (ellipeinc(x, e) - arcs), angles)
        angles = res.x
    return angles
def func(data, a, b, c):
    x, y = data
    return (np.absolute(x) / a) ** c + (np.absolute(y) / b) ** c
a = 10
b = 20
n = 100
phi = angles_in_ellipse(n, a, b)
e = (1.0 - a ** 2.0 / b ** 2.0) ** 0.5
arcs = ellipeinc(phi, e)
noise = default_rng(0).normal(0, 1, n) / 2
pp, pcov = curve_fit(func, (b * np.sin(phi) + noise,
a * np.cos(phi) + noise),
np.ones(n), method='lm')
plt.scatter(b * np.sin(phi) + noise, a * np.cos(phi) + noise,
label='Data Points')
plt.gca().add_patch(Ellipse(xy=(0, 0), width=2 * pp[0], height=2 * pp[1],
angle=0, fill=False))
plt.gca().set_aspect('equal')
plt.tight_layout()
plt.show()
As you decrease noise values, pp will tend to (b, a, 2).

Fit bipolar sigmoid python

I'm stuck trying to fit a bipolar sigmoid curve - I'd like to have the following curve:
but I need it shifted and stretched. I have the following inputs:
x[0] = 8, x[48] = 2
So over 48 periods I need to drop from 8 to 2 using a bipolar sigmoid function to approximate a nice smooth dropoff. Any ideas how I could derive the curve that would fit those parameters?
Here's what I have so far, but I need to change the sigmoid function:
import math
import matplotlib.pyplot as plt
def sigmoid(x):
    return 1 / (1 + math.exp(-x))
plt.plot([sigmoid(float(z)) for z in range(1,48)])
You could redefine the sigmoid function like so
import numpy as np
import matplotlib.pyplot as plt
def sigmoid(x, a, b, c, d):
    """ General sigmoid function
    a adjusts amplitude
    b adjusts y offset
    c adjusts x offset
    d adjusts slope """
    y = ((a-b) / (1 + np.exp(x-(c/2))**d)) + b
    return y
x = np.arange(49)
y = sigmoid(x, 8, 2, 48, 0.3)
plt.plot(x, y)
Severin's answer is likely more robust, but this should be fine if all you want is a quick and dirty solution.
In [2]: y[0]
Out[2]: 7.9955238269969806
In [3]: y[48]
Out[3]: 2.0044761730030203
Start from the generic bipolar sigmoid function:
f(x,m,b)= 2/(1+exp(-b*(x-m))) - 1
It has two parameters, the shift m and the scale b, and these are the two unknowns.
You have two conditions: f(0) = 8 and f(48) = 2.
Take the first condition and express b in terms of m; together with the second condition this gives a non-linear equation. Solve it numerically with fsolve from SciPy, then recover b and m.
Here is a related question and answer that uses a similar method: How to random sample lognormal data in Python using the inverse CDF and specify target percentiles?
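A sigmoid only approaches its asymptotes, so end-point conditions such as 8 at x = 0 and 2 at x = 48 can be met only approximately unless the curve is rescaled. The sketch below takes one possible route, which is an assumption and not the fsolve procedure described above: it scales the sigmoid to the range [2, 8], puts the midpoint at x = 24, and picks the steepness in closed form so that both end points lie within a chosen tolerance eps of their targets. The names scaled_sigmoid, steepness_for_tolerance and eps are hypothetical.

```python
import numpy as np

def scaled_sigmoid(x, lo, hi, mid, k):
    """Decreasing sigmoid running from hi (far left) to lo (far right), centred at mid."""
    return lo + (hi - lo) / (1.0 + np.exp(k * (x - mid)))

def steepness_for_tolerance(lo, hi, half_width, eps):
    """Steepness k for which the curve equals hi - eps at x = mid - half_width."""
    return np.log((hi - lo - eps) / eps) / half_width

lo, hi = 2.0, 8.0
mid = 24.0   # halfway between x = 0 and x = 48
eps = 0.01   # allowed distance from 8 at x = 0 and from 2 at x = 48

k = steepness_for_tolerance(lo, hi, mid, eps)
x = np.arange(49)
y = scaled_sigmoid(x, lo, hi, mid, k)

print(y[0], y[48])
```

By symmetry around the midpoint, fixing the value at x = 0 also fixes the value at x = 48, which is why a single steepness parameter is enough here.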
Alternatively, you could also use curve_fit which might come in handy if you have more than just two datapoints. The output looks like this:
As you can see, the graph contains the desired data points. I used @lanery's function for the fit; you can of course choose any function you like. This is the code with some inline comments:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def sigmoid(x, a, b, c, d):
    return ((a - b) / (1. + np.exp(x - (c / 2)) ** d)) + b
# one needs at least as many data points as parameters, so I just duplicate the data
xdata = [0., 48.] * 2
ydata = [8., 2.] * 2
# plot data
plt.plot(xdata, ydata, 'bo', label='data')
# fit the data
popt, pcov = curve_fit(sigmoid, xdata, ydata, p0=[1., 1., 50., 0.5])
# plot the result
xdata_new = np.linspace(0, 50, 100)
plt.plot(xdata_new, sigmoid(xdata_new, *popt), 'r-', label='fit')
plt.legend(loc='best')
plt.show()

Add constraints to scipy.optimize.curve_fit?

I have the option to add bounds to sio.curve_fit. Is there a way to expand upon this bounds feature that involves a function of the parameters? In other words, say I have an arbitrary function with two or more unknown constants. And then let's also say that I know the sum of all of these constants is less than 10. Is there a way I can implement this last constraint?
import numpy as np
import scipy.optimize as sio
def f(x, a, b, c):
    return a*x**2 + b*x + c
x = np.linspace(0, 100, 101)
y = 2*x**2 + 3*x + 4
popt, pcov = sio.curve_fit(f, x, y, \
bounds = [(0, 0, 0), (10 - b - c, 10 - a - c, 10 - a - b)]) # a + b + c < 10
Now, this would obviously error, but I think it helps to get the point across. Is there a way I can incorporate a constraint function involving the parameters to a curve fit?
Thanks!
With lmfit, you would define 4 parameters (a, b, c, and delta). a and b can vary freely. delta is allowed to vary, but has a maximum value of 10 to represent the inequality. c would be constrained to be delta-a-b (so, there are still 3 variables: c will vary, but not independently from the others). If desired, you could also put bounds on the values for a, b, and c. Without testing, your code would be approximately:
import numpy as np
from lmfit import Model, Parameters
def f(x, a, b, c):
    return a*x**2 + b*x + c
x = np.linspace(0, 100.0, 101)
y = 2*x**2 + 3*x + 4.0
fmodel = Model(f)
params = Parameters()
params.add('a', value=1, vary=True)
params.add('b', value=4, vary=True)
params.add('delta', value=5, vary=True, max=10)
params.add('c', expr = 'delta - a - b')
result = fmodel.fit(y, params, x=x)
print(result.fit_report())
Note that if you actually get to a situation where the constraint expression or bounds dictate the values for the parameters, uncertainties may not be estimated.
curve_fit and least_squares only accept box constraints. In scipy.optimize, SLSQP can deal with more complicated constraints.
For curve fitting specifically, you can have a look at lmfit package.
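As an illustration of the SLSQP route, here is a minimal sketch that minimizes the sum of squared residuals directly with scipy.optimize.minimize and passes the sum constraint as an inequality. The x range, the starting point and the limit of 6 are illustrative assumptions (the limit is chosen below the true sum of 9 so that the constraint is actually active). SLSQP handles non-strict inequalities, so this enforces a + b + c <= limit.

```python
import numpy as np
from scipy.optimize import minimize

def f(x, a, b, c):
    return a * x**2 + b * x + c

x = np.linspace(0, 2, 50)
y = f(x, 2.0, 3.0, 4.0)  # true parameters sum to 9

def sum_of_squares(params):
    """Least-squares objective that curve_fit would otherwise minimize."""
    return np.sum((f(x, *params) - y) ** 2)

limit = 6.0  # hypothetical limit on a + b + c

# For SLSQP, an 'ineq' constraint means fun(params) >= 0
constraints = [{"type": "ineq", "fun": lambda p: limit - np.sum(p)}]
bounds = [(0, None)] * 3  # box constraints: a, b, c >= 0

result = minimize(sum_of_squares, x0=[1.0, 1.0, 1.0], method="SLSQP",
                  bounds=bounds, constraints=constraints)

print(result.x, result.x.sum())
```

Unlike curve_fit, minimize does not return a covariance matrix, so parameter uncertainties would have to be estimated separately.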

How to calculate a Fourier series in Numpy?

I have a periodic function of period T and would like to know how to obtain the list of the Fourier coefficients. I tried using fft module from numpy but it seems more dedicated to Fourier transforms than series.
Maybe it is a lack of mathematical knowledge, but I can't see how to calculate the Fourier coefficients from fft.
Help and/or examples appreciated.
In the end, the most simple thing (calculating the coefficient with a riemann sum) was the most portable/efficient/robust way to solve my problem:
import numpy as np
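# y (samples), time (sample times) and period are assumed to be defined already
def cn(n):
    c = y*np.exp(-1j*2*n*np.pi*time/period)
    return c.sum()/c.size
def f(x, Nh):
    f = np.array([2*cn(i)*np.exp(1j*2*i*np.pi*x/period) for i in range(1,Nh+1)])
    return f.sum()
y2 = np.array([f(t,50).real for t in time])
# plot is assumed to be matplotlib.pyplot.plot (e.g. in a pylab session)
plot(time, y)
plot(time, y2)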
gives me:
This is an old question, but since I had to code this, I am posting here the solution that uses the numpy.fft module, that is likely faster than other hand-crafted solutions.
The DFT is the right tool for the job of calculating up to numerical precision the coefficients of the Fourier series of a function, defined as an analytic expression of the argument or as a numerical interpolating function over some discrete points.
This is the implementation, which allows to calculate the real-valued coefficients of the Fourier series, or the complex valued coefficients, by passing an appropriate return_complex:
def fourier_series_coeff_numpy(f, T, N, return_complex=False):
    """Calculates the first 2*N+1 Fourier series coeff. of a periodic function.
    Given a periodic function f(t) with period T, this function returns the
    coefficients a0, {a1,a2,...},{b1,b2,...} such that:
    f(t) ~= a0/2+ sum_{k=1}^{N} ( a_k*cos(2*pi*k*t/T) + b_k*sin(2*pi*k*t/T) )
    If return_complex is set to True, it returns instead the coefficients
    {c0,c1,c2,...}
    such that:
    f(t) ~= sum_{k=-N}^{N} c_k * exp(i*2*pi*k*t/T)
    where we define c_{-n} = complex_conjugate(c_{n})
    Refer to wikipedia for the relation between the real-valued and complex
    valued coeffs at http://en.wikipedia.org/wiki/Fourier_series.
    Parameters
    ----------
    f : the periodic function, a callable like f(t)
    T : the period of the function f, so that f(0)==f(T)
    N : the function will return the first N + 1 Fourier coeff.
    Returns
    -------
    if return_complex == False, the function returns:
    a0 : float
    a,b : numpy float arrays describing respectively the cosine and sine coeff.
    if return_complex == True, the function returns:
    c : numpy 1-dimensional complex-valued array of size N+2 (c0..cN, then the Nyquist term)
    """
    # From Shannon's theorem we must use a sampling freq. larger than twice the
    # maximum frequency you want to catch in the signal.
    f_sample = 2 * N
    # we also need to use an integer sampling frequency, or the
    # points will not be equispaced between 0 and 1. We then add +2 to f_sample
    t, dt = np.linspace(0, T, f_sample + 2, endpoint=False, retstep=True)
    y = np.fft.rfft(f(t)) / t.size
    if return_complex:
        return y
    else:
        y *= 2
        return y[0].real, y[1:-1].real, -y[1:-1].imag
This is an example of usage:
from numpy import ones_like, cos, pi, sin, allclose
T = 1.5 # any real number
def f(t):
    """example of periodic function in [0,T]"""
    n1, n2, n3 = 1., 4., 7. # in Hz, or nondimensional for the matter.
    a0, a1, b4, a7 = 4., 2., -1., -3
    return a0 / 2 * ones_like(t) + a1 * cos(2 * pi * n1 * t / T) + b4 * sin(
        2 * pi * n2 * t / T) + a7 * cos(2 * pi * n3 * t / T)
N_chosen = 10
a0, a, b = fourier_series_coeff_numpy(f, T, N_chosen)
# we have as expected that
assert allclose(a0, 4)
assert allclose(a, [2, 0, 0, 0, 0, 0, -3, 0, 0, 0])
assert allclose(b, [0, 0, 0, -1, 0, 0, 0, 0, 0, 0])
And the plot of the resulting a0,a1,...,a10,b1,b2,...,b10 coefficients:
This is an optional test for the function, for both modes of operation. You should run this after the example, or define a periodic function f and a period T before running the code.
# #### test that it works with real coefficients:
from numpy import linspace, allclose, cos, sin, ones_like, exp, pi, \
complex64, zeros
def series_real_coeff(a0, a, b, t, T):
    """calculates the Fourier series with period T at times t,
    from the real coeff. a0,a,b"""
    tmp = ones_like(t) * a0 / 2.
    for k, (ak, bk) in enumerate(zip(a, b)):
        tmp += ak * cos(2 * pi * (k + 1) * t / T) + bk * sin(
            2 * pi * (k + 1) * t / T)
    return tmp
t = linspace(0, T, 100)
f_values = f(t)
a0, a, b = fourier_series_coeff_numpy(f, T, 52)
# construct the series:
f_series_values = series_real_coeff(a0, a, b, t, T)
# check that the series and the original function match to numerical precision:
assert allclose(f_series_values, f_values, atol=1e-6)
# #### test similarly that it works with complex coefficients:
def series_complex_coeff(c, t, T):
    """calculates the Fourier series with period T at times t,
    from the complex coeff. c"""
    tmp = zeros((t.size), dtype=complex64)
    for k, ck in enumerate(c):
        # sum from 0 to +N
        tmp += ck * exp(2j * pi * k * t / T)
        # sum from -N to -1
        if k != 0:
            tmp += ck.conjugate() * exp(-2j * pi * k * t / T)
    return tmp.real
f_values = f(t)
c = fourier_series_coeff_numpy(f, T, 7, return_complex=True)
f_series_values = series_complex_coeff(c, t, T)
assert allclose(f_series_values, f_values, atol=1e-6)
Numpy isn't really the right tool to calculate Fourier series components, as your data has to be discretely sampled. You really want to use something like Mathematica, or you should be using Fourier transforms.
To roughly do it, let's look at something simple, a sawtooth wave, where we can easily calculate the Fourier coefficients: c_n = -i ((-1)^(n+1))/n for n>0; e.g., c_n = { -i, i/2, -i/3, i/4, -i/5, i/6, ... } for n=1,2,3,4,5,6 (using Sum( c_n exp(i 2 pi n x) ) as Fourier series).
import numpy
x = numpy.arange(0,2*numpy.pi, numpy.pi/1000)
y = (x+numpy.pi/2) % numpy.pi - numpy.pi/2
fourier_trans = numpy.fft.rfft(y)/1000
If you look at the first several Fourier components:
array([ -3.14159265e-03 +0.00000000e+00j,
2.54994550e-16 -1.49956612e-16j,
3.14159265e-03 -9.99996710e-01j,
1.28143395e-16 +2.05163971e-16j,
-3.14159265e-03 +4.99993420e-01j,
5.28320925e-17 -2.74568926e-17j,
3.14159265e-03 -3.33323464e-01j,
7.73558750e-17 -3.41761974e-16j,
-3.14159265e-03 +2.49986840e-01j,
1.73758496e-16 +1.55882418e-17j,
3.14159265e-03 -1.99983550e-01j,
-1.74044469e-16 -1.22437710e-17j,
-3.14159265e-03 +1.66646927e-01j,
-1.02291982e-16 -2.05092972e-16j,
3.14159265e-03 -1.42834113e-01j,
1.96729377e-17 +5.35550532e-17j,
-3.14159265e-03 +1.24973680e-01j,
-7.50516717e-17 +3.33475329e-17j,
3.14159265e-03 -1.11081501e-01j,
-1.27900121e-16 -3.32193126e-17j,
-3.14159265e-03 +9.99670992e-02j,
First neglect the components that are near 0 due to floating-point accuracy (~1e-16) and treat them as zero. The more difficult part is to see that the 3.14159e-03 terms (which remain after we divide by 1000) should also be recognized as zero, as the function is periodic. If we neglect those two contributions we get:
fourier_trans = [0,0,-i,0,i/2,0,-i/3,0,i/4,0,-i/5,0,i/6, ...
and you can see the Fourier series numbers come up as every other number. rfft returns only the non-negative frequencies, ordered [c0, c1, c2, ...]; the nonzero values sit at the even indices here because the sampled window from 0 to 2 pi spans two periods of the function built in the code, which has period pi. I'm using conventions according to wiki: http://en.wikipedia.org/wiki/Fourier_series.
Again, I'd suggest using mathematica or a computer algebra system capable of integrating and dealing with continuous functions.
As other answers have mentioned, it seems that what you are looking for is a symbolic computing package, so numpy isn't suitable. If you wish to use a free python-based solution, then either sympy or sage should meet your needs.
Do you have a list of discrete samples of your function, or is your function itself discrete? If so, the Discrete Fourier Transform, calculated using an FFT algorithm, provides the Fourier coefficients directly (see here).
On the other hand, if you have an analytic expression for the function, you probably need a symbolic math solver of some kind.
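To make the link between the DFT and the series coefficients concrete, here is a small sketch for a function with known coefficients. It assumes the function is sampled at N equally spaced points over exactly one period, with the end point excluded; the test function and the values of T and N are illustrative.

```python
import numpy as np

T = 1.5  # period
N = 64   # number of samples over exactly one period

def f(t):
    # Known series: a0/2 = 2, a1 = 3, b2 = 4
    return 2.0 + 3.0 * np.cos(2 * np.pi * t / T) + 4.0 * np.sin(2 * np.pi * 2 * t / T)

# endpoint=False so that t = T (a repeat of t = 0) is not sampled twice
t = np.linspace(0, T, N, endpoint=False)

# c[n] approximates the complex coefficient c_n of exp(i*2*pi*n*t/T), n >= 0
c = np.fft.rfft(f(t)) / N

# Real-valued coefficients in the convention f ~ a0/2 + sum(a_n cos + b_n sin);
# the last entry of c is the Nyquist term and is zero for this band-limited example
a0 = 2 * c[0].real
a_n = 2 * c[1:].real
b_n = -2 * c[1:].imag

print(c[:3])
print(a0, a_n[:2], b_n[:2])
```

For a band-limited function sampled densely enough, as here, the values agree with the analytic coefficients to floating-point precision; for functions with jumps or many harmonics, the DFT values are only approximations that improve as N grows.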
