How to avoid singularity in numerical integration in Python

I am trying to compute the following integral in Python:
gamma = ∫_{rmin}^{rmax} f(r) g(r) dr
where the second term of the integrand is
g(r) = a r^alpha exp(-b r)
I am currently computing it numerically using Simpson's rule:
import numpy as np
from scipy.integrate import simpson  # simps was removed in SciPy 1.14
r = np.linspace(rmin, rmax, 5000)
f_val = some_complicated_function(r, params)
# np.exp is vectorized, so no Python-level loop over r is needed
g_val = a * r**alpha * np.exp(-b * r)
gamma = simpson(f_val * g_val, x=r)
However, the result is not accurate for small r values. I checked the values in g_val and they look like this:
array([2.48243025e-31, 1.62729999e-27, 3.31169129e-26, ...,
1.34177288e-13, 1.34053922e-13, 1.33930643e-13])
which is probably causing underflow.
The most typical workaround would be to integrate the function analytically rather than numerically. However, the problem is that the function f(r) is very complicated and is not available in explicit (analytic) form.
Does anyone have an idea how to compute this kind of integral more accurately?

Related

How to deal with numerical integration in python with small results?

I am running into an issue with integration in Python returning incorrect values for an integral with a known analytical solution. The integral in question is
I = ∫_0^∞ x^2 exp(-x^2/(2 sigma^2)) dx
For the value of sigma I am using (1e-15), the solution to this integral has a value of ~ 1.25e-45. However, when I use the scipy integrate package to calculate this I get zero, which I believe has to do with the precision required by the calculation.
#scipy method
import numpy as np
from scipy.integrate import quad
sigma = 1e-15
f = lambda x: (x**2) * np.exp(-x**2/(2*sigma**2))
#perform the integral and print the result
solution = quad(f,0,np.inf)[0]
print(solution)
0.0
Since precision was an issue, I also tried another recommended package, mpmath, which did not return 0 but was off by ~7 orders of magnitude from the correct answer. Testing larger values of sigma results in a solution very close to the corresponding exact solution, but it seems to get increasingly incorrect as sigma gets smaller.
#mpmath method
import mpmath as mp
sigma = 1e-15
f = lambda x: (x**2) * mp.exp(-x**2/(2*sigma**2))
#perform the integral and print the result
solution = mp.quad(f,[0,np.inf])
print(solution)
2.01359486678988e-52
From here I could use some advice on getting a more accurate answer, as I would like to have some confidence applying python integration methods to integrals that cannot be solved analytically.
You should add extra points for the function as 'mid points'. I added 100 points from 1e-100 to 1 to increase accuracy.
#mpmath method
import numpy as np
import mpmath as mp
sigma = 1e-15
f = lambda x: (x**2) * mp.exp(-x**2/(2*sigma**2))
#perform the integral and print the result
solution = mp.quad(f,[0,*np.logspace(-100,0,100),np.inf])
print(solution)
1.25286197427129e-45
Edit: turns out you need 10000 points instead of 100 points to get a more accurate result, of 1.25331413731554e-45, but it takes a few seconds to calculate.
Most numerical integrators will run into issues with numbers that small due to floating point precision. One solution is to scale the integral before calculating. Substituting q = x/sigma, the integral becomes sigma^3 ∫_0^∞ q^2 exp(-q^2/2) dq:
f = lambda q: sigma**3*(q**2) * np.exp(-q**2/2)
solution = quad(f, 0, np.inf)[0]
# solution: 1.2533156529417088e-45
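A minimal sketch of the same rescaling, assuming the integrand from the question. As a small variation on the snippet above, the factor sigma**3 is kept outside the call to quad, so the integrand quad sees is of order one and the default absolute tolerance is not larger than the values being integrated. The name scaled_integrand is only illustrative, and the closed form sqrt(pi/2)*sigma**3 is used as a check.

```python
import math
import numpy as np
from scipy.integrate import quad

sigma = 1e-15

def scaled_integrand(q):
    # integrand after the substitution q = x/sigma, without the sigma**3 prefactor
    return q**2 * np.exp(-q**2 / 2)

# integrate the order-one function, then restore the scale afterwards
scaled_value, abs_err = quad(scaled_integrand, 0, np.inf)
solution = sigma**3 * scaled_value

# closed form of the original integral, for comparison
exact = math.sqrt(math.pi / 2) * sigma**3
print(solution, exact)
```

Keeping the prefactor outside matters because quad stops as soon as its error estimate falls below the absolute tolerance, which by default is far larger than 1e-45.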

Are there any inherent limitations to the scipy.integrate.quad function?

I am currently attempting to compute a definite integral of a Gaussian function and I am receiving an answer of 0, although I am convinced the true value is not zero.
This leads me to ask: are there limitations on what exactly the quad function can do when computing a definite integral? Am I using quad in the correct application? How exactly does quad find an integral anyway?
import math
from scipy.integrate import quad
def g(λ,a,u,o):
    return a*math.exp((λ-u)**2/(-2*o**2))
exc = quad(g, 4000, 8000, args=(1,6700,2.125))[0]
print(exc)
I have plotted this gaussian so I know that it is not zero within the range I have set. I have also plugged the integral in my scientific calculator and it spits out the answer of 5.33. So now I am at the conclusion that I have either made some mistake that I could not find or I am utilising quad in the wrong situation.
Any and all help is appreciated :)
Your function is basically 0 everywhere bar a small range, relative to the area you are trying to integrate over
You can add some points to help the function break the integration into smaller parts
points(sequence of floats,ints), optional A sequence of break points
in the bounded integration interval where local difficulties of the
integrand may occur (e.g., singularities, discontinuities). The
sequence does not have to be sorted. Note that this option cannot be
used in conjunction with weight.
import math
from scipy.integrate import quad
def g(λ,a,u,o):
    return a*math.exp((λ-u)**2/(-2*o**2))
exc = quad(g, 4000, 8000, args=(1,6700,2.125), full_output=1, points=[6500, 7000])[0]
print(exc)
5.3265850835908095
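As a hedged variation on the snippet above, the break points can be derived from the peak position and width instead of being typed in by hand. This sketch assumes the centre u and width o are known in advance; the choice of 10 widths on either side is arbitrary, and the closed form a*o*sqrt(2*pi) serves only as a check.

```python
import math
from scipy.integrate import quad

def g(lam, a, u, o):
    # Gaussian with amplitude a, centre u and width o
    return a * math.exp(-(lam - u)**2 / (2 * o**2))

a, u, o = 1.0, 6700.0, 2.125

# break points bracketing the peak (10 widths either side is an arbitrary choice)
breakpoints = [u - 10 * o, u + 10 * o]
exc, abs_err = quad(g, 4000, 8000, args=(a, u, o), points=breakpoints)

# the tails outside [4000, 8000] are negligible, so the full-line value applies
exact = a * o * math.sqrt(2 * math.pi)
print(exc, exact)
```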
There seems to be no way around this problem
As mentioned by Tom, the region where your function is significantly greater than 0 is too far out to be detected by the integration process. Theoretically, your u could also be 1e12, but it's asking a bit much of an integration scheme to find that.
One easy remedy is to increase the quadrature domain to [-inf, +inf], and shift the function such that the "interesting" part is around 0.
import math
from scipy.integrate import quad
import numpy as np
a = 1.0
u = 0.0
o = 2.125
def g(x):
    return a * math.exp(-(x - u) ** 2 / (2 * o ** 2))
exc = quad(g, -np.inf, +np.inf)[0]
print(exc)
5.326585083590876

Sympy TypeError when using scipy.stats normal cdf

Why does Sympy throw a Type error when I use the scipy.stats.norm? How can I solve the equation?
from sympy import Eq, Symbol, solve, Piecewise
from scipy.stats import norm
import numpy as np
x = Symbol('x')
eqn = Eq((x-0.2)/0.3, norm.cdf((np.log(100/110) + x**2/2)/x))
print(solve(eqn))
Output:
TypeError: cannot determine truth value of Relational
Symbolic setup
If you are looking for symbolic solutions, use symbolic functions: e.g., SymPy's log not NumPy's log. The normal CDF is also available from SymPy's stats module as cdf(Normal("x", 0, 1)). The correct SymPy setup would be this:
from sympy import Eq, Rational, Symbol, log
from sympy.stats import cdf, Normal
x = Symbol('x')
eqn = Eq((x-Rational('0.2'))/Rational('0.3'), cdf(Normal("x", 0, 1))((log(Rational(100, 110)) + x**2/2)/x))
Notice that I put Rational('0.2') where you had 0.2. The distinction between rationals and floats is important for symbolic math. Note also that the division by x belongs inside the argument of the CDF, as in your original expression. The equation now looks good from the formal point of view:
Eq(10*x/3 - 2/3, erf(sqrt(2)*(x**2/2 - log(11) + log(10))/(2*x))/2 + 1/2)
Unfortunately it also looks hopeless: there's no closed form solution for things like that, involving a transcendental function equated to a polynomial. Naturally, solve(eqn) will fail. So all of the above does is demonstrate correct use of SymPy, but it doesn't change the fact that there is no symbolic solution.
Numeric solution
To solve this numerically, do the opposite: drop the SymPy parts and import fsolve from SciPy.
from scipy.stats import norm
from scipy.optimize import fsolve
import numpy as np
f = lambda x: (x-0.2)/0.3 - norm.cdf((np.log(100/110) + x**2/2)/x)
print(fsolve(f, 1)) # 1 is a random initial guess
The answer is 0.33622392.
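As an alternative sketch, assuming a bracketing interval is acceptable, scipy.optimize.brentq avoids the need for an initial guess. The interval [0.2, 1.0] is my own choice: the function is negative at 0.2 and positive at 1.0, so a root is guaranteed in between.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def f(x):
    # left-hand side minus right-hand side of the original equation
    return (x - 0.2) / 0.3 - norm.cdf((np.log(100 / 110) + x**2 / 2) / x)

# f(0.2) < 0 and f(1.0) > 0, so the interval brackets a root
root = brentq(f, 0.2, 1.0)
print(root)
```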

Error in ODE Solver in Python

I'm working with a spark combustion engine model and, for various reasons, I'm using Python to model the combustion. I'm trying to use the ODE solvers, but the output is completely unrealistic. I discovered that the integration of the cylinder volume is wrong. I have already tried both the "odeint" and "ode" solvers, but the result is the same.
The code defines the derivative of the volume with respect to theta and integrates it to find the volume. I included the analytical equation for comparison.
Note: I had a similar problem in Matlab, but that was when I tried to use degrees in trigonometric functions. When I changed to radians the problem was solved.
The code follows:
from scipy.integrate import odeint
from scipy.integrate import ode
from scipy import integrate
import math
import sympy
from sympy import sqrt, sin, cos, tan, atan
from pylab import *
from RatesComp import *
V_real=np.zeros((100))
def Volume(V,theta):
    V_sol = V[0]
    dVdtheta = Vtdc*(r-1)/2 *( sin(theta) + eps/2*sin(2*theta)/sqrt(1-(eps**2)*sin(theta)**2))
    return [dVdtheta]
#Geometry
eps = 0.25; # half stroke to rod ratio, s/2l
r = 10; # compression ratio
Vtdc = 6.9813e-05 # volume at TDC
# Initial Conditions
theta0 = - pi
V_init = 0.0006283
theta = linspace(-pi,pi,100)
solve = odeint( Volume, V_init, theta)
# Analytical Result
Size = len(theta)
for i in range(0, Size,1):
    V_real[i] = Vtdc*(1+(r-1)/2*(1-cos(theta[i])+ 1/eps*(1-(1-(eps**2)*sin(theta[i])**2)**0.5)))
figure(1)
plot(theta, solve[:,0],label="Comput")
plot(theta, V_real[0:Size],label="Real")
ylabel('Volume [m^3]')
xlabel('CA [Rad]')
legend()
grid(True)
show()
The figure shows the cylinder volume: the real (analytical) result and the computed one.
Can someone explain why this problem happens?
Apparently you are using Python 2. There, the declaration r=10 gives r the type integer, which leads to an unwanted integer division in (r-1)/2 in the 'real' solution. In the derivative function the float value Vtdc is the first factor in the product, so the whole product is evaluated in floating point.
Thus change to r=10.0 or use (r-1.0)/2 or 0.5*(r-1).
You should also set V_init = r*Vtdc, as that is the value of V_real at theta = -pi.
If you use Python 2, add from __future__ import division as the first line to get Python 3 division; see https://mail.python.org/pipermail/tutor/2008-March/060886.html
In Python 2, dividing two integer values gives an integer result, not a float. This may solve your problem without large changes to the code.
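A minimal sketch of the corrected setup, using the geometry values from the question with r stored as a float and V_init = r*Vtdc. It uses NumPy functions in place of the pylab/sympy star imports, and the tighter tolerances passed to odeint are my own choice, not part of the original code.

```python
import numpy as np
from scipy.integrate import odeint

eps = 0.25          # half stroke to rod ratio, s/2l
r = 10.0            # compression ratio, stored as a float
Vtdc = 6.9813e-05   # volume at TDC

def dV_dtheta(V, theta):
    # derivative of the cylinder volume with respect to crank angle
    root = np.sqrt(1 - eps**2 * np.sin(theta)**2)
    return [Vtdc * (r - 1) / 2 * (np.sin(theta) + eps / 2 * np.sin(2 * theta) / root)]

def V_exact(theta):
    # analytical volume, used here only for comparison
    root = np.sqrt(1 - eps**2 * np.sin(theta)**2)
    return Vtdc * (1 + (r - 1) / 2 * (1 - np.cos(theta) + (1 - root) / eps))

theta = np.linspace(-np.pi, np.pi, 100)
V_init = r * Vtdc   # analytical volume at theta = -pi
V_num = odeint(dV_dtheta, [V_init], theta, rtol=1e-10, atol=1e-12)[:, 0]

max_err = np.max(np.abs(V_num - V_exact(theta)))
print(max_err)
```

With a consistent initial condition and float arithmetic, the computed and analytical curves should coincide to within the solver tolerance.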

Calculating 1/r*d/dr(r*f) numerically in python when r=0. f is a function of r

Usually when you do this by hand there's no problem, as the 1/r usually cancels with another r. Doing this numerically with scipy.misc.derivative works like a charm for r different from zero. But of course, as soon as I ask for r = 0, I get a division by zero, which I expected. So how else could you calculate this numerically? I insist that everything has to be done numerically, as my functions are now so complicated that I won't be able to find a derivative manually. Thank you!
My code:
rAtheta = lambda _r: _r*Atheta(_r,theta,z,t)
if r != 0:
    return derivative(rAtheta,r,dx=1e-10,order=3)/r
else:
    #What should go here so that it doesn't blow up when calculating the gradient?
tl;dr: use symbolic differentiation, or complex step differentiation if that fails
If you insist on using numerical methods, you really have to approximate the limit of the derivative as r->0 one way or the other.
I suggest trying complex step differentiation. The idea is to use complex arguments inside the function you're trying to differentiate; this usually gets rid of the numerical instability that is inherent in standard finite difference schemes. The result is a procedure that needs complex arithmetic (hooray numpy, and python in general!) but in turn can be much more stable at small dx values.
Here's another point: complex step differentiation uses
F′(x0) = Im(F(x0+ih))/h + O(h^2)
Let's apply this to your r=0 case:
F′(0) = Im(F(ih))/h + O(h^2)
There are no singularities even for r=0! Choose a small h, possibly the same dx you're passing to your function, and use that:
def rAtheta(_r):
    # note that named lambdas are usually frowned upon
    return _r*Atheta(_r,theta,z,t)
tol = 1e-10
dr = 1e-12
if np.abs(r) > tol: # or the built-in abs, or math.fabs
    return derivative(rAtheta,r,dx=dr,order=3)/r
else:
    # complex step; note this still divides by r, so it covers small nonzero r
    return rAtheta(r + 1j*dr).imag/dr/r
Here is the above in action for f = r*ln(r):
The result is straightforwardly smooth, even though the points below r=1e-10 were computed with complex step differentiation.
Very important note: notice the separation between tol and dr in the code. The former is used to determine when to switch between methods, and the latter is used as a step in complex step differentiation. Look what happens when tol=dr=1e-10:
the result is a smoothly wrong function below r=1e-10! That's why you always have to be careful with numerical differentiation. And I wouldn't advise going too much below that in dr, as machine precision will bite you sooner or later.
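To make the formula F′(x0) = Im(F(x0+ih))/h concrete, here is a small self-contained sketch comparing it with a central finite difference. The test function exp and the step sizes are purely illustrative choices, not taken from the question.

```python
import numpy as np

def F(x):
    # any function that accepts complex input works; exp is just a test case
    return np.exp(x)

x0 = 1.0
h = 1e-12        # complex step size
dx = 1e-6        # finite difference step size

# complex step: no subtraction of nearly equal numbers is involved
d_complex = F(x0 + 1j * h).imag / h
# central difference, for comparison: suffers from cancellation in the numerator
d_central = (F(x0 + dx) - F(x0 - dx)) / (2 * dx)

exact = np.exp(x0)
print(abs(d_complex - exact), abs(d_central - exact))
```

The requirement is that F is implemented with operations that behave sensibly for complex arguments; functions using abs or comparisons on their input would need extra care.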
But why stop here? I'm fairly certain that your functions could be written in a vectorized way, i.e. they could accept an array of radial points. Using complex step differentiation you don't have to loop over the radial points (which you would have to resort to using scipy.misc.derivative). Example:
import numpy as np
import matplotlib.pyplot as plt
def Atheta(r,*args):
    return r*np.log(r) # <-- vectorized expression
def rAtheta(r):
    return r*Atheta(r) #,theta,z,t) # <-- vectorized as much as Atheta is
def vectorized_difffun(rlist):
    r = np.asarray(rlist)
    dr = 1e-12
    return (rAtheta(r + 1j*dr)).imag/dr/r
rarr = np.logspace(-12,-2,20)
darr = vectorized_difffun(rarr)
plt.figure()
plt.loglog(rarr,np.abs(darr),'.-')
plt.xlabel(r'$r$')
plt.ylabel(r'$|\frac{1}{r} \frac{d}{dr}(r^2 \ln r)|$')
plt.tight_layout()
plt.show()
The result should be familiar:
Having cleared the fun weirdness that is complex step differentiation, I should note that you should strongly consider using symbolic math. In cases like this when 1/r factors disappear exactly, it wouldn't hurt if you reached this conclusion exactly. After all double precision is still just double precision.
For this you'll need the sympy module, define your function symbolically once, differentiate it symbolically once, turn your simplified result into a numpy function using sympy.lambdify, and use this numerical function as much as you need (assuming that this whole process runs in finite time and the resulting function is not too slow to use). Example:
import sympy as sym
# only needed for the example:
import numpy as np
import matplotlib.pyplot as plt
r = sym.symbols('r')
f = r*sym.ln(r)
df = sym.diff(r*f,r)
res_sym = sym.simplify(df/r)
res_num = sym.lambdify(r,res_sym,'numpy')
rarr = np.logspace(-12,-2,20)
darr = res_num(rarr)
plt.figure()
plt.loglog(rarr,np.abs(darr),'.-')
plt.xlabel(r'$r$')
plt.ylabel(r'$|\frac{1}{r} \frac{d}{dr}(r^2 \ln r)|$')
plt.tight_layout()
plt.show()
resulting in
As you see, the resulting function was vectorized thanks to lambdify using numpy during the conversion from symbolic to numeric function. Obviously, the best solution is the symbolic one as long as the resulting function is not so complicated to make its practical use impossible. I urge you to first try the symbolic version, and if for some reason it's not applicable, switch to complex step differentiation, with due caution.
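For the point r = 0 itself, one possible sketch uses the product rule: (1/r) d/dr (r f) = f/r + f′. Assuming f(0) = 0 and f is smooth near the origin (an assumption not stated in the question, and one that r*ln(r) does not satisfy), both terms tend to f′(0), so the limit is 2 f′(0), which complex step differentiation can evaluate without dividing by r. The function f = sin(r) and the name div_like are hypothetical stand-ins for illustration only.

```python
import numpy as np

def f(r):
    # illustrative stand-in for Atheta; it vanishes at r = 0 and accepts complex input
    return np.sin(r)

def div_like(r, dr=1e-12, tol=1e-10):
    """Evaluate (1/r) d/dr (r f), switching to the limit 2 f'(0) near the origin."""
    if abs(r) > tol:
        z = r + 1j * dr
        # complex step derivative of r*f(r), then divide by r
        return (z * f(z)).imag / dr / r
    # near r = 0: f/r and f' both approach f'(0), so the limit is 2 f'(0)
    return 2 * f(1j * dr).imag / dr

print(div_like(0.0), div_like(0.5))
```

For this test function the exact expression is sin(r)/r + cos(r), which equals 2 at the origin. Points with 0 < |r| <= tol are replaced by the value at the origin, so the error there is of the order of tol.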
