I am trying to apply numpy to this code I wrote for trapezium rule integration:
def integral(a, b, n):
    delta = (b - a) / float(n)
    s = 0.0
    s += np.sin(a) / (a * 2)
    for i in range(1, n):
        s += np.sin(a + i * delta) / (a + i * delta)
    s += np.sin(b) / (b * 2.0)
    return s * delta
I am trying to get the return value from the new function to be something like this:
return delta * ((2 * np.sin(x[1:-1])) + np.sin(x[0]) + np.sin(x[-1])) / 2 * x
I have been trying for a long time now to make a breakthrough, but all my attempts have failed.
One thing I attempted and do not understand is why the following code gives a too many indices for array error:
def integral(a, b, n):
    d = (b - a) / float(n)
    x = np.arange(a, b, d)
    J = np.where(x[:,1] < np.sin(x[:,0]) / x[:,0])[0]
Every hint/advice is very much appreciated.
You forgot to sum over sin(x):
>>> def integral(a, b, n):
...     x, delta = np.linspace(a, b, n+1, retstep=True)
...     y = np.sin(x)
...     y[0] /= 2
...     y[-1] /= 2
...     return delta * y.sum()
...
>>> integral(0, np.pi / 2, 10000)
0.9999999979438324
>>> integral(0, 2 * np.pi, 10000)
0.0
>>> from scipy.integrate import quad
>>> quad(np.sin, 0, np.pi / 2)
(0.9999999999999999, 1.1102230246251564e-14)
>>> quad(np.sin, 0, 2 * np.pi)
(2.221501482512777e-16, 4.3998892617845996e-14)
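For a quick cross-check (a sketch, not part of the original answer), NumPy's built-in trapezoidal rule np.trapz on the same grid should agree with integral above:
>>> x = np.linspace(0, np.pi / 2, 10001)
>>> np.trapz(np.sin(x), x)  # ~0.9999999979, matching integral(0, np.pi/2, 10000)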
Meanwhile, I tried this, too.
import numpy as np

def T_n(a, b, n, fun):
    delta = (b - a) / float(n)               # delta formula
    x_i = lambda a, i, delta: a + i * delta  # calculate x_i
    return 0.5 * delta * \
        (2 * sum(fun(x_i(a, np.arange(0, n + 1), delta)))
         - fun(x_i(a, 0, delta))
         - fun(x_i(a, n, delta)))
I reconstructed the code using the formulas at the bottom of this page:
https://matheguru.com/integralrechnung/trapezregel.html
The summation over range(0, n+1) - which gives [0, 1, ..., n] -
is implemented using numpy. Usually you would collect the values with a for loop in plain Python,
but numpy's vectorized behaviour can be used here:
np.arange(0, n+1) gives np.array([0, 1, ..., n]).
If that array is given as an argument to the function (here abstracted as fun), the formula
is evaluated for x_0 through x_n and the results are collected in a numpy array. So fun(x_i(...)) returns a numpy array of the function applied to x_0 through x_n, and this array is summed up by sum().
The entire sum() is multiplied by 2, and then the function values at x_0 and x_n are subtracted (since in the trapezoid formula only the middle summands, not the first and the last, are multiplied by 2). This was kind of a hack; see the sketch below for a formulation without it.
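For comparison, here is a minimal sketch (not from the original answer) of the same trapezoid rule with the endpoint weights made explicit instead of the doubling hack:
import numpy as np

def T_n_weights(a, b, n, fun):
    # trapezoid rule with explicit weights: 1/2 at the endpoints, 1 elsewhere
    delta = (b - a) / float(n)
    x = a + np.arange(0, n + 1) * delta  # x_0 ... x_n
    w = np.ones(n + 1)
    w[0] = w[-1] = 0.5
    return delta * np.dot(w, fun(x))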
The linked German page uses the function fun(x) = x^2 + 3,
which can be nicely defined on the fly using a lambda expression:
fun = lambda x: x ** 2 + 3
a = -2
b = 3
n = 6
You could instead use a normal function definition, too: def fun(x): return x ** 2 + 3.
So I tested by typing the command:
T_n(a, b, n, fun)
Which correctly returned:
Out[172]: 27.24537037037037
For your case, just assign np.sin to fun and pass your values for a, b, and n into the function call, like:
fun = np.sin  # everywhere `fun` appears in the function, it will behave as if
              # `np.sin` stood there - this is possible because Python
              # treats its functions as first-class citizens
a = #your value
b = #your value
n = #your value
Finally, you can call:
T_n(a, b, n, fun)
And it will work!
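For instance, with the setup from the first answer (a sketch; the exact digits may differ in the last places):
fun = np.sin
a, b, n = 0, np.pi / 2, 10000
T_n(a, b, n, fun)  # ~0.9999999979, matching integral(0, np.pi/2, 10000) above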
I'm trying to solve an integral equation using the following code (irrelevant parts removed):
def _pdf(self, a, b, c, t):
    pdf = some_pdf(a, b, c, t)
    return pdf

def _result(self, a, b, c, flag):
    return fsolve(lambda t: flag - 1 + quad(lambda tau: self._pdf(a, b, c, tau), 0, t)[0], x0)[0]
This takes a probability density function and finds a result tau such that the integral of the pdf from tau to infinity is equal to flag. Note that x0 is a (float) estimate of the root defined elsewhere in the script. Also note that flag is an extremely small number, on the order of 1e-9.
In my application fsolve only successfully finds a root about 50% of the time. It often just returns x0, significantly biasing my results. There is no closed form for the integral of the pdf, so I am forced to integrate numerically, and I suspect this might be introducing some inaccuracy.
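One diagnostic (not part of the original post) is to ask fsolve for its convergence status via full_output and check the ier flag inside _result:
from scipy.optimize import fsolve
from scipy.integrate import quad

root, infodict, ier, mesg = fsolve(
    lambda t: flag - 1 + quad(lambda tau: self._pdf(a, b, c, tau), 0, t)[0],
    x0, full_output=True)
if ier != 1:
    print("fsolve did not converge:", mesg)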
EDIT:
This has since been solved using a method other than that described below, but I'd like to get quadpy to work and see if the results improve at all. The specific code I'm trying to get to work is as follows:
import quadpy
import numpy as np
from scipy.optimize import *
from scipy.special import gammaln, kv, gammaincinv, gamma
from scipy.integrate import quad, simps
l = 226.02453163
mu = 0.00212571582056
nu = 4.86569872444
flag = 2.5e-09
estimate = 3 * mu
def pdf(l, mu, nu, t):
    return np.exp(np.log(2) + (l + nu - 1 + 1) / 2 * np.log(l * nu / mu)
                  + (l + nu - 1 - 1) / 2 * np.log(t)
                  + np.log(kv(nu - l, 2 * np.sqrt(l * nu / mu * t)))
                  - gammaln(l) - gammaln(nu))

def tail_cdf(l, mu, nu, tau):
    i, error = quadpy.line_segment.adaptive_integrate(
        lambda t: pdf(l, mu, nu, t), [tau, 10000], 1.0e-10
    )
    return i

result = fsolve(lambda tau: flag - tail_cdf(l, mu, nu, tau[0]), estimate)
When I run this I get an assertion error from assert all(lengths > minimum_interval_length). I'm not quite sure how to remedy this; any help would be very much appreciated!
As an example, I tried 1 / x integrated between 1 and alpha to retrieve the target integral value 2.0. This
import quadpy
from scipy.optimize import fsolve
def f(alpha):
    beta, _ = quadpy.quad(lambda x: 1.0/x, 1, alpha)
    return beta
target = 2.0
res = fsolve(lambda alpha: target - f(alpha), x0=2.0)
print(res)
correctly returns 7.38905611 (i.e., e**2, since the exact solution of log(alpha) = 2 is alpha = e**2).
The failing quadpy assertion
assert all(lengths > minimum_interval_length)
you're getting means that the adaptive integration hit its limit: either relax your tolerance a bit (a sketch follows), or decrease the minimum_interval_length (see here).
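Applied to the code from the question, relaxing the tolerance would look like this (a sketch, assuming the same quadpy version as in the question, where the third positional argument of adaptive_integrate is the tolerance):
i, error = quadpy.line_segment.adaptive_integrate(
    lambda t: pdf(l, mu, nu, t), [tau, 10000], 1.0e-8  # was 1.0e-10
)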
I'm implementing Bayesian Changepoint Detection in Python/NumPy (if you are interested, have a look at the paper). I need to compute likelihoods for data in ranges [a, b], where a and b can take all values from 1 to n. However, I can prune the computation at some points, so that I don't have to compute every likelihood. On the other hand, some likelihoods are used more than once, so I can save time by storing the values in a matrix P[a, b]. Right now I check whether a value is already computed whenever I use it, but I find that a bit of a hassle. It looks like this:
# ...
P = np.ones((n, n)) * np.inf  # a likelihood can't be inf, so inf serves
                              # as a pseudo value for "not yet computed"
for a in range(n):
    for b in range(a, n):
        # The following two lines get annoying and error-prone if you
        # use P more than once
        if P[a, b] == np.inf:
            P[a, b] = likelihood(data, a, b)
        Q[a] += P[a, b] * g[a] * Q[a - 1]  # some computation using P[a, b]
# ...
I wonder whether there is a more intuitive and pythonic way to achieve this, without the if ... statement before every use of P[a, b] - something like an automagical function call if some condition is not met. I could of course make the likelihood function aware of the fact that it could save values, but then it needs some kind of state (e.g. it becomes an object). I want to avoid that.
The likelihood function
Since it was asked for in a comment, I add the likelihood function. It actually computes the conjugate prior and then the likelihood. And all in log representation... So it is quite complicated.
import numpy as np
from scipy.special import gammaln

def gaussian_obs_log_likelihood(data, t, s):
    n = s - t
    mean = data[t:s].sum() / n
    muT = (n * mean) / (1 + n)
    nuT = 1 + n
    alphaT = 1 + n / 2
    betaT = 1 + 0.5 * ((data[t:s] - mean) ** 2).sum() + (n / (1 + n)) * (mean ** 2 / 2)
    scale = (betaT * (nuT + 1)) / (alphaT * nuT)

    # splitting the PDF of the Student's t distribution up is /much/ faster (~ factor 20)
    prob = 1
    for yi in data[t:s]:
        prob += np.log(1 + (yi - muT) ** 2 / (nuT * scale))
    lgA = gammaln((nuT + 1) / 2) - np.log(np.sqrt(np.pi * nuT * scale)) - gammaln(nuT / 2)
    return n * lgA - (nuT + 1) / 2 * prob
Although I work with Python 2.7, answers for both 2.7 and 3.x are appreciated.
I would use a sibling of defaultdict for this (you can't use defaultdict directly since it won't tell you the key that is missing):
class Cache(object):
    def __init__(self):
        self.cache = {}

    def get(self, a, b):
        key = (a, b)
        result = self.cache.get(key, None)
        if result is None:
            result = likelihood(data, a, b)
            self.cache[key] = result
        return result
Another approach would be using a cache decorator on likelihood, as described here; a minimal sketch follows.
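A minimal sketch of that approach (assuming Python 3's functools.lru_cache; since numpy arrays are not hashable, data is captured in a closure and only (a, b) serves as the cache key):
from functools import lru_cache

def make_cached_likelihood(data):
    @lru_cache(maxsize=None)
    def cached_likelihood(a, b):
        return likelihood(data, a, b)  # data comes from the enclosing scope
    return cached_likelihood

P = make_cached_likelihood(data)
# later: Q[a] += P(a, b) * g[a] * Q[a - 1]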
I have a periodic function of period T and would like to know how to obtain the list of its Fourier coefficients. I tried using the fft module from numpy, but it seems more dedicated to Fourier transforms than to series.
Maybe it is a lack of mathematical knowledge, but I can't see how to calculate the Fourier coefficients from fft.
Help and/or examples appreciated.
In the end, the simplest thing (calculating the coefficients with a Riemann sum) was the most portable/efficient/robust way to solve my problem:
import numpy as np
import matplotlib.pyplot as plt

# y, time and period are assumed to be defined elsewhere: y holds the samples
# of the periodic signal at the sample times in `time`, and `period` is T
def cn(n):
    c = y * np.exp(-1j * 2 * n * np.pi * time / period)
    return c.sum() / c.size

def f(x, Nh):
    f = np.array([2 * cn(i) * np.exp(1j * 2 * i * np.pi * x / period) for i in range(1, Nh + 1)])
    return f.sum()

y2 = np.array([f(t, 50).real for t in time])

plt.plot(time, y)
plt.plot(time, y2)
gives me a plot of the original signal y and the reconstructed series y2.
This is an old question, but since I had to code this, I am posting here the solution that uses the numpy.fft module, which is likely faster than other hand-crafted solutions.
The DFT is the right tool for the job of calculating, up to numerical precision, the coefficients of the Fourier series of a function, defined either as an analytic expression of the argument or as a numerical interpolating function over some discrete points.
This is the implementation, which allows one to calculate the real-valued coefficients of the Fourier series, or the complex-valued coefficients, by passing an appropriate return_complex flag:
def fourier_series_coeff_numpy(f, T, N, return_complex=False):
    """Calculates the first 2*N+1 Fourier series coeff. of a periodic function.

    Given a periodic function f(t) with period T, this function returns the
    coefficients a0, {a1,a2,...}, {b1,b2,...} such that:

    f(t) ~= a0/2 + sum_{k=1}^{N} ( a_k*cos(2*pi*k*t/T) + b_k*sin(2*pi*k*t/T) )

    If return_complex is set to True, it returns instead the coefficients
    {c0,c1,c2,...} such that:

    f(t) ~= sum_{k=-N}^{N} c_k * exp(i*2*pi*k*t/T)

    where we define c_{-n} = complex_conjugate(c_{n}).

    Refer to wikipedia for the relation between the real-valued and complex-
    valued coeffs at http://en.wikipedia.org/wiki/Fourier_series.

    Parameters
    ----------
    f : the periodic function, a callable like f(t)
    T : the period of the function f, so that f(0)==f(T)
    N : the function will return the first N + 1 Fourier coeff.

    Returns
    -------
    if return_complex == False, the function returns:

    a0 : float
    a, b : numpy float arrays describing respectively the cosine and sine coeff.

    if return_complex == True, the function returns:

    c : numpy 1-dimensional complex-valued array of size N+2
    """
    # From the Shannon sampling theorem we must use a sampling frequency
    # larger than the maximum frequency you want to catch in the signal.
    f_sample = 2 * N
    # We also need an integer number of samples, or the points will not be
    # equispaced over the period. We then add +2 to f_sample.
    t, dt = np.linspace(0, T, f_sample + 2, endpoint=False, retstep=True)

    y = np.fft.rfft(f(t)) / t.size

    if return_complex:
        return y
    else:
        y *= 2
        return y[0].real, y[1:-1].real, -y[1:-1].imag
This is an example of usage:
from numpy import ones_like, cos, pi, sin, allclose

T = 1.5  # any real number

def f(t):
    """example of periodic function in [0,T]"""
    n1, n2, n3 = 1., 4., 7.  # in Hz, or nondimensional for that matter.
    a0, a1, b4, a7 = 4., 2., -1., -3
    return (a0 / 2 * ones_like(t) + a1 * cos(2 * pi * n1 * t / T)
            + b4 * sin(2 * pi * n2 * t / T) + a7 * cos(2 * pi * n3 * t / T))

N_chosen = 10
a0, a, b = fourier_series_coeff_numpy(f, T, N_chosen)

# we have as expected that
assert allclose(a0, 4)
assert allclose(a, [2, 0, 0, 0, 0, 0, -3, 0, 0, 0])
assert allclose(b, [0, 0, 0, -1, 0, 0, 0, 0, 0, 0])
[Plot of the resulting a0, a1, ..., a10, b1, b2, ..., b10 coefficients.]
This is an optional test for the function, for both modes of operation. You should run this after the example, or define a periodic function f and a period T before running the code.
# #### test that it works with real coefficients:
from numpy import linspace, allclose, cos, sin, ones_like, exp, pi, \
    complex64, zeros

def series_real_coeff(a0, a, b, t, T):
    """calculates the Fourier series with period T at times t,
    from the real coeff. a0, a, b"""
    tmp = ones_like(t) * a0 / 2.
    for k, (ak, bk) in enumerate(zip(a, b)):
        tmp += ak * cos(2 * pi * (k + 1) * t / T) + bk * sin(
            2 * pi * (k + 1) * t / T)
    return tmp

t = linspace(0, T, 100)
f_values = f(t)
a0, a, b = fourier_series_coeff_numpy(f, T, 52)
# construct the series:
f_series_values = series_real_coeff(a0, a, b, t, T)
# check that the series and the original function match to numerical precision:
assert allclose(f_series_values, f_values, atol=1e-6)
# #### test similarly that it works with complex coefficients:
def series_complex_coeff(c, t, T):
    """calculates the Fourier series with period T at times t,
    from the complex coeff. c"""
    tmp = zeros(t.size, dtype=complex64)
    for k, ck in enumerate(c):
        # sum from 0 to +N
        tmp += ck * exp(2j * pi * k * t / T)
        # sum from -N to -1
        if k != 0:
            tmp += ck.conjugate() * exp(-2j * pi * k * t / T)
    return tmp.real

f_values = f(t)
c = fourier_series_coeff_numpy(f, T, 7, return_complex=True)
f_series_values = series_complex_coeff(c, t, T)
assert allclose(f_series_values, f_values, atol=1e-6)
NumPy isn't really the right tool to calculate Fourier series components, as your data has to be discretely sampled. You really want to use something like Mathematica, or should be using Fourier transforms.
To do it roughly anyway, let's look at something simple: a triangle wave of period 2*pi, where we can easily calculate the Fourier coefficients (c_n = -i*((-1)^(n+1))/n for n > 0; e.g., c_n = {-i, i/2, -i/3, i/4, -i/5, i/6, ...} for n = 1, 2, 3, 4, 5, 6, using Sum(c_n exp(i 2 pi n x)) as the Fourier series).
import numpy
x = numpy.arange(0, 2 * numpy.pi, numpy.pi / 1000)
y = (x + numpy.pi / 2) % numpy.pi - numpy.pi / 2
fourier_trans = numpy.fft.rfft(y) / 1000
If you look at the first several Fourier components:
array([ -3.14159265e-03 +0.00000000e+00j,
2.54994550e-16 -1.49956612e-16j,
3.14159265e-03 -9.99996710e-01j,
1.28143395e-16 +2.05163971e-16j,
-3.14159265e-03 +4.99993420e-01j,
5.28320925e-17 -2.74568926e-17j,
3.14159265e-03 -3.33323464e-01j,
7.73558750e-17 -3.41761974e-16j,
-3.14159265e-03 +2.49986840e-01j,
1.73758496e-16 +1.55882418e-17j,
3.14159265e-03 -1.99983550e-01j,
-1.74044469e-16 -1.22437710e-17j,
-3.14159265e-03 +1.66646927e-01j,
-1.02291982e-16 -2.05092972e-16j,
3.14159265e-03 -1.42834113e-01j,
1.96729377e-17 +5.35550532e-17j,
-3.14159265e-03 +1.24973680e-01j,
-7.50516717e-17 +3.33475329e-17j,
3.14159265e-03 -1.11081501e-01j,
-1.27900121e-16 -3.32193126e-17j,
-3.14159265e-03 +9.99670992e-02j,
First, neglect the components that are near 0 due to floating point accuracy (~1e-16) as being zero. The harder part is to see that the 3.14159e-03 numbers (pi divided by the 1000 we divided by above) should also be recognized as zero, as the function is periodic. So if we neglect those two factors we get:
fourier_trans = [0, 0, -i, 0, i/2, 0, -i/3, 0, i/4, 0, -i/5, 0, i/6, ...]
and you can see the Fourier series numbers come up as every other number (I haven't investigated, but I believe the components correspond to [c0, c-1, c1, c-2, c2, ...]). I'm using the conventions according to the wiki page: http://en.wikipedia.org/wiki/Fourier_series. A quick check of this pattern is sketched below.
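A minimal check (a sketch using the fourier_trans array computed above): every other entry, starting at index 2, carries the series coefficients.
harmonics = fourier_trans[2::2]
print(numpy.round(harmonics.imag[:6], 4))  # ~ [-1., 0.5, -0.3333, 0.25, -0.2, 0.1666]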
Again, I'd suggest using Mathematica or a computer algebra system capable of integrating and dealing with continuous functions.
As other answers have mentioned, it seems that what you are looking for is a symbolic computing package, so numpy isn't suitable. If you wish to use a free python-based solution, then either sympy or sage should meet your needs.
Do you have a list of discrete samples of your function, or is your function itself discrete? If so, the Discrete Fourier Transform, calculated using an FFT algorithm, provides the Fourier coefficients directly (see here).
On the other hand, if you have an analytic expression for the function, you probably need a symbolic math solver of some kind.
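For that analytic case, a minimal sketch using sympy's built-in Fourier series support (assuming sympy is installed; f(x) = x**2 is just an illustration):
from sympy import fourier_series, pi, symbols

x = symbols('x')
s = fourier_series(x**2, (x, -pi, pi))  # Fourier series of x^2 on [-pi, pi]
print(s.truncate(4))                    # first few terms, symbolically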