Are there any inherent limitations to the scipy.integrate.quad function? - python

I am currently attempting to perform a definite integral of a Gaussian function, and I am getting an answer of 0 when I am convinced that is not the case.
This leads me to ask: are there limitations on what the quad function can do when performing a definite integral? Am I using quad in the correct application? How exactly does quad find an integral anyway?
import math
from scipy.integrate import quad
def g(λ, a, u, o):
    return a*math.exp((λ-u)**2/(-2*o**2))
exc = quad(g, 4000, 8000, args=(1,6700,2.125))[0]
print(exc)
I have plotted this Gaussian, so I know that it is not zero within the range I have set. I have also plugged the integral into my scientific calculator, and it gives an answer of 5.33. So I conclude that either I have made some mistake I cannot find, or I am using quad in the wrong situation.
Any and all help is appreciated :)

Your function is essentially 0 everywhere except in a small range, relative to the interval you are trying to integrate over.
You can pass some break points to help quad split the integration into smaller parts:
points (sequence of floats, ints), optional: A sequence of break points in the bounded integration interval where local difficulties of the integrand may occur (e.g., singularities, discontinuities). The sequence does not have to be sorted. Note that this option cannot be used in conjunction with weight.
import math
from scipy.integrate import quad
def g(λ, a, u, o):
    return a*math.exp((λ-u)**2/(-2*o**2))
exc = quad(g, 4000, 8000, args=(1,6700,2.125), full_output=1, points=[6500, 7000])[0]
print(exc)
5.3265850835908095
Without some hint like this about where the peak lies, there seems to be no way around this problem.
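If the location and width of the peak are known (as they are here, via u and o), one way to avoid picking the break points by hand is to derive them from those parameters. A minimal sketch, assuming a few standard deviations around the peak is enough:
import math
from scipy.integrate import quad

def g(λ, a, u, o):
    return a*math.exp((λ-u)**2/(-2*o**2))

a, u, o = 1, 6700, 2.125
# Break points bracketing the peak, so quad subdivides the interval there
pts = [u - 5*o, u, u + 5*o]
exc = quad(g, 4000, 8000, args=(a, u, o), points=pts)[0]
print(exc)  # ≈ 5.3266, i.e. the full Gaussian area o*sqrt(2*pi)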

As mentioned by Tom, the region where your function is significantly greater than 0 is too small, relative to the integration interval, to be detected by the integration process. Theoretically, your u could just as well be 1e12, and it would be asking a bit much of an integration scheme to find that.
One easy remedy is to extend the quadrature domain to [-inf, +inf] and shift the function so that the "interesting" part sits around 0.
import math
from scipy.integrate import quad
import numpy as np
a = 1.0
u = 0.0
o = 2.125
def g(x):
    return a * math.exp(-(x - u) ** 2 / (2 * o ** 2))
exc = quad(g, -np.inf, +np.inf)[0]
print(exc)
5.326585083590876
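Applied to the parameters from the question, the same idea amounts to substituting t = λ - u so the peak sits at 0 (a minimal sketch, reusing the question's g):
import math
import numpy as np
from scipy.integrate import quad

a, u, o = 1.0, 6700.0, 2.125

def g(λ):
    return a * math.exp(-(λ - u)**2 / (2 * o**2))

# Shift the peak to 0 and integrate over the whole real line
exc = quad(lambda t: g(t + u), -np.inf, np.inf)[0]
print(exc)  # ≈ 5.3266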

Related

How to deal with numerical integration in python with small results?

I am running into an issue with integration in Python returning incorrect values for an integral with a known analytical solution. The integral in question is
∫₀^∞ x² exp(-x²/(2σ²)) dx
(written here as text since I can't post photos yet). For the value of sigma I am using (1e-15), this integral evaluates to ~1.25e-45. However, when I use the scipy integrate package to calculate it I get zero, which I believe has to do with the precision required by the calculation.
#scipy method
import numpy as np
from scipy.integrate import quad
sigma = 1e-15
f = lambda x: (x**2) * np.exp(-x**2/(2*sigma**2))
#perform the integral and print the result
solution = quad(f,0,np.inf)[0]
print(solution)
0.0
Since precision seemed to be the issue, I also tried the recommended mpmath package, which did not return 0 but was off by ~7 orders of magnitude from the correct answer. Testing larger values of sigma gives solutions very close to the corresponding exact values, but the result seems to get increasingly incorrect as sigma gets smaller.
#mpmath method
import numpy as np
import mpmath as mp
sigma = 1e-15
f = lambda x: (x**2) * mp.exp(-x**2/(2*sigma**2))
#perform the integral and print the result
solution = mp.quad(f,[0,np.inf])
print(solution)
2.01359486678988e-52
From here I could use some advice on getting a more accurate answer, as I would like to have some confidence applying python integration methods to integrals that cannot be solved analytically.
You should add extra points to the integration interval as 'mid points'; I added 100 points from 1e-100 to 1 to increase accuracy.
#mpmath method
import numpy as np
import mpmath as mp
sigma = 1e-15
f = lambda x: (x**2) * mp.exp(-x**2/(2*sigma**2))
#perform the integral and print the result
solution = mp.quad(f,[0,*np.logspace(-100,0,100),np.inf])
print(solution)
1.25286197427129e-45
Edit: it turns out you need 10000 points instead of 100 to get a more accurate result of 1.25331413731554e-45, but it takes a few seconds to calculate.
Most numerical integrators will run into issues with numbers that small due to floating-point precision. One solution is to scale the integral before calculating. Substituting q = x/sigma (so x = sigma*q and dx = sigma*dq), the integral becomes sigma**3 * ∫ q² exp(-q²/2) dq:
import numpy as np
from scipy.integrate import quad

sigma = 1e-15
f = lambda q: sigma**3 * (q**2) * np.exp(-q**2/2)
solution = quad(f, 0, np.inf)[0]
# solution: 1.2533156529417088e-45
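As a quick sanity check, the closed-form value of this integral is sigma**3 * sqrt(pi/2), which the scaled quad result reproduces to about six significant figures:
import numpy as np

sigma = 1e-15
# Closed-form value of the integral: sigma^3 * sqrt(pi/2)
print(sigma**3 * np.sqrt(np.pi / 2))  # ≈ 1.2533141373155e-45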

How to avoid singularity in numerical integration in Python

I am trying to compute the following integration in Python:
γ = ∫_{rmin}^{rmax} f(r) · g(r) dr
where the second term of the integrand is
g(r) = a · r^α · exp(-b·r)
I am currently computing it numerically by using Simpson's rule:
import math
import numpy as np
from scipy.integrate import simps
r = np.linspace(rmin, rmax, 5000)
f_val = some_complicated_function(r, params)
g_val = a*np.multiply(r**alpha, [math.exp(-b*r_) for r_ in r])
gamma = simps(np.multiply(f_val, g_val), r)
However, the result is not accurate for small r values. I checked the values of g_val, and they looked like this:
array([2.48243025e-31, 1.62729999e-27, 3.31169129e-26, ...,
1.34177288e-13, 1.34053922e-13, 1.33930643e-13])
which is probably causing the underflow.
The most typical workaround would be to integrate the function analytically rather than numerically. The problem, however, is that the function f(r) is very complicated and is not available as an explicit (analytic) function.
Does anyone have ideas for computing this kind of integration more accurately?

How to optimise the numerical evaluation of a SymPy integral?

I'm somewhat of a newbie to SymPy and was hoping someone could point out ways to optimise my code.
I need to numerically evaluate a somewhat involved expression to very high precision (150–300 decimal places), and it is taking 30 seconds or longer per parameter set, which is very long given the size of the parameter space to be covered.
I have used lambdify with the mpmath backend and meijerg=True in the integral handling, which brought run-times down significantly. Are there any other methods that could be used? Ideally it would be great to push evaluation times below 1 second. My code is:
import mpmath
from mpmath import mpf, mp
mp.dps = 150 # ideally would like to have this set to 300
import numpy as np
from sympy import besselj, symbols, hankel2, legendre, sin, cos, tan, summation, I
from sympy import lambdify, expand, Integral
import time
x, alpha, k, m,n, r1, R, theta = symbols('x alpha k m n r1 R theta')
r1 = (R*cos(alpha))/cos(theta) #
Imn_part1 = (n*hankel2(n-1,k*r1)-(n+1)*hankel2(n+1,k*r1))*legendre(n, cos(theta))*cos(theta)
Imn_part2 = n*(n+1)*hankel2(n, k*r1)*(legendre(n-1, cos(theta)-legendre(n+1, cos(theta))))/k*r1
Imn_parts = expand(Imn_part1+Imn_part2)
Imn_expr = expand(Imn_parts*legendre(m,cos(theta))*(r1**2/R**2)*tan(theta))
Imn = Integral(Imn_expr, (theta, 0, alpha)).doit(meijerg=True)
# the lambdified expression
Imn_lambdify = lambdify([m,n,k,R,alpha], Imn,'mpmath')
When numerical inputs are given to the function, it takes a long time (30–40 s).
substitute_dict = {'alpha':mpf(np.radians(10)), 'k':5,'R':mpf(0.1), 'm':20,'n':10}
print('starting calculation...')
start = time.time()
output = Imn_lambdify(substitute_dict['m'],
                      substitute_dict['n'],
                      substitute_dict['k'],
                      substitute_dict['R'],
                      substitute_dict['alpha'])
print(time.time()-start)
OS/package versions used:
Linux Mint 19.2
Python 3.8.5
SymPy 1.7.1
MPMath 1.2.1
Setting meijerg=True has just caused SymPy not to try as hard when evaluating the integral. It still can't evaluate it, but it has split it into five sub-integrals, which you can see if you print Imn. You might as well leave it as one integral (leave off the doit()):
Imn = Integral(Imn_expr, (theta, 0, alpha))
For me, the split integral evaluates a little faster, but this is also about the same speed:
Imn = Integral(simplify(Imn_expr), (theta, 0, alpha))
Ultimately, the thing that makes things slow is the number of digits you are using. If you don't actually need that many digits, you shouldn't use them. Note that mpmath automatically increases the precision internally to avoid cancellation, so it is unnecessary to do so yourself. With the default dps of 15 I get the same value (just with fewer digits) as with 150.
You can try substituting your values directly into your expression, if they do not change, and seeing if SymPy can simplify Imn_expr further with them.
As an aside, you are using np.radians(10), which is a machine float, since that is what NumPy produces. This completely defeats the purpose of computing the final answer to 150 digits, since this input parameter is only accurate to about 15. Consider using mpmath.pi/18 instead to get a value that is correct to the number of digits you specified.
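For example, a minimal sketch of that last point:
from mpmath import mp

mp.dps = 150
alpha = mp.pi / 18  # 10 degrees, accurate to the full working precision
print(alpha)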

Integrating a gaussian over a very long interval

I want to integrate a Gaussian function over a very large interval. I chose the scipy.integrate.quad function for the integration. It seems to work only when I select a small enough interval. When I run the code below,
from scipy.integrate import quad
from math import pi, exp, sqrt
def func(x, mean, sigma):
    return 1/(sqrt(2*pi)*sigma) * exp(-1/2*((x-mean)/sigma)**2)
print(quad(func, 0, 1e+31, args=(1e+29, 1e+28))[0]) # case 1
print(quad(func, 0, 1e+32, args=(1e+29, 1e+28))[0]) # case 2
print(quad(func, 0, 1e+33, args=(1e+29, 1e+28))[0]) # case 3
print(quad(func, 1e+25, 1e+33, args=(1e+29, 1e+28))[0]) # case 4
the following is printed:
1.0
1.0000000000000004
0.0
0.0
To obtain a reasonable result, I had to try different lower/upper bounds for the integral several times and empirically settle on [0, 1e+32]. This seems risky to me, because whenever the mean and sigma of the Gaussian change, I have to search for suitable bounds all over again.
Is there a clean way to integrate the function from 0 to 1e+50 without fiddling with the bounds? If not, how can you tell in advance which bounds will give a non-zero value?
In short, you can't.
Over such a long interval, the region where the Gaussian is non-zero is tiny, and the adaptive procedure that works under the hood of integrate.quad fails to see it. So would pretty much any adaptive routine, other than by chance.
Notice that the integral of a Gaussian density over [a, b] equals Φ((b-m)/s) - Φ((a-m)/s), where Φ(x) is the CDF of a standard normal random variable; it gets its own symbol because it cannot be expressed in terms of elementary functions. Also note that Φ(x) = 1/2*(1 + erf(x/sqrt(2))), so you need not call quad to actually perform an integration and may have better luck with erf from scipy:
import numpy as np
from scipy.special import erf

def prob(mu, sigma, a, b):
    phi = lambda x: 1/2*(1 + erf((x - mu)/(sigma*np.sqrt(2))))
    return phi(b) - phi(a)
This should give more accurate results (and it does, compared to the above):
>>> print(prob(0, 1e+31, 0, 1e+50))
0.5
>>> print(prob(0, 1e+32, 1e+28, 1e+29))
0.000359047985937333
>>> print(prob(0, 1e+33, 1e+28, 1e+29))
3.5904805169684195e-05
>>> print(prob(1e+25, 1e+33, 1e+28, 1e+29))
3.590480516979522e-05
and avoids the severe floating-point error you were experiencing. However, the regions you are integrating over can be so small in area that you may still see 0.
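For reference, here is a sketch applying the prob function above to the four cases from the question (mean 1e+29, sigma 1e+28). Each returns essentially 1.0, since the whole bell curve lies inside every one of those intervals:
import numpy as np
from scipy.special import erf

def prob(mu, sigma, a, b):
    phi = lambda x: 1/2*(1 + erf((x - mu)/(sigma*np.sqrt(2))))
    return phi(b) - phi(a)

print(prob(1e+29, 1e+28, 0, 1e+31))      # case 1
print(prob(1e+29, 1e+28, 0, 1e+32))      # case 2
print(prob(1e+29, 1e+28, 0, 1e+33))      # case 3 (quad returned 0.0)
print(prob(1e+29, 1e+28, 1e+25, 1e+33))  # case 4 (quad returned 0.0)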

Integral of Intensity function in python

There is a function which determines the intensity of the Fraunhofer diffraction pattern of a circular aperture... (more information)
The integral of the function over the interval x = [-3.8317, 3.8317] should be about 83.8% (if we assume that I0 is 100), and when you increase the interval to [-13.33, 13.33] it should be about 95%.
But when I compute the integral in Python, the answer is wrong... I don't know what's going wrong in my code :(
from scipy.integrate import quad
from scipy import special as sp
I0=100.0
dist=3.8317
I = quad(lambda x: I0*((2*sp.j1(x)/x)**2), -dist, dist)[0]
print(I)
The result of the integral can't be bigger than 100 (I0), because this is the diffraction of I0... I don't know, maybe it's the scaling... maybe the method! :(
The problem seems to be in the function's behaviour near zero. If the function is plotted, it looks smooth;
however, scipy.integrate.quad complains about round-off errors, which is very strange for such a nice curve. The reason is that the function is not defined at 0 (you are dividing by zero!), so the integration does not go well.
You could use a simpler integration method or do something about your function. You may also be able to integrate from very close to zero on both sides. Even so, with these numbers the integral does not match the results you expect.
However, I think I have a hunch of what your problem is. As far as I remember, the integral you have shown is actually the intensity (power/area) of Fraunhofer diffraction as a function of distance from the center. If you want to integrate the total power within some radius, you will have to do it in two dimensions.
By simple area-integration rules, you should multiply your function by 2πr before integrating (or by x instead of r, in your case). Then it becomes:
f = lambda r: r * (sp.j1(r)/r)**2
or
f = lambda r: sp.j1(r)**2 / r
or, even better, using the Bessel recurrence J1(r)/r = (J0(r) + J2(r))/2:
f = lambda r: r * ((sp.j0(r) + sp.jn(2, r)) / 2)**2
The last form is best as it does not suffer from any singularities. It is based on Jaime's comment to the original answer (see the comment below this answer!).
(Note that I omitted a couple of constants.) Now you can integrate it from zero to infinity (no negative radii):
fullpower = quad(f, 1e-9, np.inf)[0]
Then you can integrate from some other radius and normalize by the full intensity:
pwr = quad(f, 1e-9, 3.8317)[0] / fullpower
And you get 0.839 (which is quite close to 84 %). If you try the larger radius (13.33):
pwr = quad(f, 1e-9, 13.33)[0] / fullpower
which gives 0.954.
It should be noted that we introduce a small error by starting the integration from 1e-9 instead of 0. The magnitude of the error can be estimated by trying different values for the starting point. The integration result changes very little between 1e-9 and 1e-12, so they seem to be safe. Of course, you could use, e.g., 1e-30, but then there may be numerical instability in the division. (In this case there isn't, but in general singularities are numerically evil.)
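A quick sketch of that check, reusing f, quad, and np from above:
for start in (1e-6, 1e-9, 1e-12):
    print(start, quad(f, start, np.inf)[0])
# The results agree closely, so the error from skipping [0, start] is negligible here.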
Let us do one thing still:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0.01, 20, 1000)
intg = np.array([ quad(f, 1e-9, xx)[0] for xx in x])
plt.plot(x, intg/fullpower)
plt.grid('on')
plt.show()
And this is what we get:
At least this looks right, the dark fringes of the Airy disk are clearly visible.
What comes to the last part of the question: I0 defines the maximum intensity (the units may be, e.g. W/m2), whereas the integral gives total power (if the intensity is in W/m2, the total power is in W). Setting the maximum intensity to 100 does not guarantee anything about the total power. That is why it is important to calculate the total power.
There actually exists a closed form equation for the total power radiated onto a circular area:
P(x) = P0 ( 1 - J0(x)^2 - J1(x)^2 ),
where P0 is the total power.
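That closed form gives a quick, quad-free check of the percentages quoted in the question (a small sketch; 3.8317 and 13.33 are the radii from the question):
from scipy import special as sp

def encircled_fraction(x):
    # P(x)/P0 = 1 - J0(x)**2 - J1(x)**2
    return 1 - sp.j0(x)**2 - sp.j1(x)**2

print(encircled_fraction(3.8317))  # ≈ 0.838
print(encircled_fraction(13.33))   # ≈ 0.952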
Note that you also can get a closed form solution for your integration using Sympy:
import sympy as sy
sy.init_printing() # LaTeX like pretty printing in IPython
x,d = sy.symbols("x,d", real=True)
I0=100
dist=3.8317
f = I0*((2*sy.besselj(1,x)/x)**2) # the integrand
F = f.integrate((x, -d, d)) # symbolic integration
print(F.evalf(subs={d:dist})) # numeric evalution
F evaluates to:
1600*d*besselj(0, Abs(d))**2/3 + 1600*d*besselj(1, Abs(d))**2/3 - 800*besselj(1, Abs(d))**2/(3*d)
with besselj(0,r) corresponding to sp.j0(r).
There might be a singularity in the integration algorithm when it evaluates the integrand at x = 0. You can exclude this point from the integration with points:
f = lambda x: I0*((2*sp.j1(x)/x)**2)
I = quad(f, -dist, dist, points=[0])[0]
I then get the following result (is this your desired result?)
331.4990321315221
