How calculate a double integral accurately using python - python

I'm trying to calculate a double integral given by :
import scipy.special as sc
from numpy.lib.scimath import sqrt as csqrt
from scipy.integrate import dblquad
def g_re(alpha, beta, k, m):
psi = csqrt(alpha ** 2 + beta ** 2 - k ** 2)
return np.real(
sc.jv(m, alpha)
* sc.jv(m, beta)
* sc.jv(m, alpha)
* np.sin(beta)
* sc.jv(m, -1j * psi)
* np.exp(-psi)
/ (alpha ** 2 * psi)
)
def g_im(alpha, beta, k, m):
psi = csqrt(alpha ** 2 + beta ** 2 - k ** 2)
return np.imag(
sc.jv(m, alpha)
* sc.jv(m, beta)
* sc.jv(m, alpha)
* np.sin(beta)
* sc.jv(m, -1j * psi)
* np.exp(-psi)
/ (alpha ** 2 * psi)
)
k = 5
m = 0
tuple_args = (k, m)
ans = dblquad(g_re, 0.0, np.inf, 0, np.inf, args=tuple_args)[0]
ans += 1j * dblquad(g_im, 0.0, np.inf, 0, np.inf, args=tuple_args)[0]
The integration intervals are along the positive real axes ([0, np.inf[). When calculating I got the following warning :
/tmp/a.py:10: RuntimeWarning: invalid value encountered in multiply
sc.jv(m, alpha)
g/home/nschloe/.local/lib/python3.9/site-packages/scipy/integrate/quadpack.py:879: IntegrationWarning: The maximum number of subdivisions (50) has been achieved.
If increasing the limit yields no improvement it is advised to analyze
the integrand in order to determine the difficulties. If the position of a
local difficulty can be determined (singularity, discontinuity) one will
probably gain from splitting up the interval and calling the integrator
on the subranges. Perhaps a special-purpose integrator should be used.
quad_r = quad(f, low, high, args=args, full_output=self.full_output,
I subdivided the domain of integration but I still got the same warning. Could you help me please.

Related

how to pass two arrays into a fuction

I have a function that typically takes in constant args and calculates volatility. I want to pass in a vector of different C's and K's to get an array of volatilities each associated with C[i], K[i]
def vol_calc(S, T, C, K, r, q, sigma):
d1 = (np.log(S / K) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * np.sqrt(T))
vega = (1 / np.sqrt(2 * np.pi)) * np.exp(-q * T) * np.sqrt(T) * np.exp((-si.norm.cdf(d1, 0.0, 1.0) ** 2) * 0.5)
tolerance = 0.000001
x0 = sigma
xnew = x0
xold = x0 - 1
while abs(xnew - xold) > tolerance:
xold = xnew
xnew = (xnew - fx - C) / vega
return abs(xnew)
but if I want to pass two arrays without turning into a nested loop, I thought I could just do:
def myfunction(S, T, r, q, sigma):
for x in K,C:
return same size as K,C
but I can't get it to work
How about this?
def vol_calc(S, T, C, K, r, q, sigma):
import numpy as np
output = np.zeros(len(C))
for num, (c, k) in enumerate(zip(C, K)):
d1 = (np.log(S / k) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * np.sqrt(T))
vega = (1 / np.sqrt(2 * np.pi)) * np.exp(-q * T) * np.sqrt(T) * np.exp((-si.norm.cdf(d1, 0.0, 1.0) ** 2) * 0.5)
tolerance = 0.000001
x0 = sigma
xnew = x0
xold = x0 - 1
while abs(xnew - xold) > tolerance:
xold = xnew
xnew = (xnew - fx - c) / vega
output[num] = abs(xnew)
return output

How to use Gradient Descent to solve this multiple terms trigonometry function?

Question is like this:
f(x) = A sin(2π * L * x) + B cos(2π * M * x) + C sin(2π * N * x)
and L,M,N are constants integer, 0 <= L,M,N <= 100
and A,B,C can be any possible integers.
Here is the given data:
x = [0,0.01,0.02,0.03,0.04,0.05,0.06,0.07,0.08,0.09,0.1,0.11,0.12,0.13,0.14,0.15,0.16,0.17,0.18,0.19,0.2,0.21,0.22,0.23,0.24,0.25,0.26,0.27,0.28,0.29,0.3,0.31,0.32,0.33,0.34,0.35,0.36,0.37,0.38,0.39,0.4,0.41,0.42,0.43,0.44,0.45,0.46,0.47,0.48,0.49,0.5,0.51,0.52,0.53,0.54,0.55,0.56,0.57,0.58,0.59,0.6,0.61,0.62,0.63,0.64,0.65,0.66,0.67,0.68,0.69,0.7,0.71,0.72,0.73,0.74,0.75,0.76,0.77,0.78,0.79,0.8,0.81,0.82,0.83,0.84,0.85,0.86,0.87,0.88,0.89,0.9,0.91,0.92,0.93,0.94,0.95,0.96,0.97,0.98,0.99]
y = [4,1.240062433,-0.7829654986,-1.332487982,-0.3337640721,1.618033989,3.512512389,4.341307895,3.515268061,1.118929599,-2.097886967,-4.990538967,-6.450324073,-5.831575611,-3.211486891,0.6180339887,4.425660706,6.980842552,7.493970785,5.891593744,2.824429495,-0.5926374511,-3.207870455,-4.263694544,-3.667432785,-2,-0.2617162175,0.5445886005,-0.169441247,-2.323237059,-5.175570505,-7.59471091,-8.488730333,-7.23200463,-3.924327772,0.6180339887,5.138501587,8.38127157,9.532377045,8.495765687,5.902113033,2.849529206,0.4768388529,-0.46697525,0.106795821,1.618033989,3.071952496,3.475795162,2.255463709,-0.4905371745,-4,-7.117914956,-8.727599664,-8.178077181,-5.544088451,-1.618033989,2.365340134,5.169257268,5.995297102,4.758922924,2.097886967,-0.8873135564,-3.06024109,-3.678989552,-2.666365632,-0.6180339887,1.452191817,2.529722611,2.016594378,-0.01374122059,-2.824429495,-5.285215072,-6.302694708,-5.246870619,-2.210419738,2,6.13956874,8.965976562,9.68000641,8.201089581,5.175570505,1.716858387,-1.02183483,-2.278560533,-1.953524751,-0.6180339887,0.7393509358,1.129293593,-0.02181188158,-2.617913164,-5.902113033,-8.727381729,-9.987404016,-9.043589913,-5.984648344,-1.618033989,2.805900027,6.034770001,7.255101454,6.368389697]
enter image description here
How to use Gradient Descent to solve this multiple terms trigonometry function?
Gradient descent is not well suited for optimisation over integers. You can try a navie relaxation where you solve in floats, and hope the rounded solution is still ok.
from autograd import grad, numpy as jnp
import numpy as np
def cast(params):
[A, B, C, L, M, N] = params
L = jnp.minimum(jnp.abs(L), 100)
M = jnp.minimum(jnp.abs(M), 100)
N = jnp.minimum(jnp.abs(N), 100)
return A, B, C, L, M, N
def pred(params, x):
[A, B, C, L, M, N] = cast(params)
return A *jnp.sin(2 * jnp.pi * L * x) + B*jnp.cos(2*jnp.pi * M * x) + C * jnp.sin(2 * jnp.pi * N * x)
x = [0,0.01,0.02,0.03,0.04,0.05,0.06,0.07,0.08,0.09,0.1,0.11,0.12,0.13,0.14,0.15,0.16,0.17,0.18,0.19,0.2,0.21,0.22,0.23,0.24,0.25,0.26,0.27,0.28,0.29,0.3,0.31,0.32,0.33,0.34,0.35,0.36,0.37,0.38,0.39,0.4,0.41,0.42,0.43,0.44,0.45,0.46,0.47,0.48,0.49,0.5,0.51,0.52,0.53,0.54,0.55,0.56,0.57,0.58,0.59,0.6,0.61,0.62,0.63,0.64,0.65,0.66,0.67,0.68,0.69,0.7,0.71,0.72,0.73,0.74,0.75,0.76,0.77,0.78,0.79,0.8,0.81,0.82,0.83,0.84,0.85,0.86,0.87,0.88,0.89,0.9,0.91,0.92,0.93,0.94,0.95,0.96,0.97,0.98,0.99]
y = [4,1.240062433,-0.7829654986,-1.332487982,-0.3337640721,1.618033989,3.512512389,4.341307895,3.515268061,1.118929599,-2.097886967,-4.990538967,-6.450324073,-5.831575611,-3.211486891,0.6180339887,4.425660706,6.980842552,7.493970785,5.891593744,2.824429495,-0.5926374511,-3.207870455,-4.263694544,-3.667432785,-2,-0.2617162175,0.5445886005,-0.169441247,-2.323237059,-5.175570505,-7.59471091,-8.488730333,-7.23200463,-3.924327772,0.6180339887,5.138501587,8.38127157,9.532377045,8.495765687,5.902113033,2.849529206,0.4768388529,-0.46697525,0.106795821,1.618033989,3.071952496,3.475795162,2.255463709,-0.4905371745,-4,-7.117914956,-8.727599664,-8.178077181,-5.544088451,-1.618033989,2.365340134,5.169257268,5.995297102,4.758922924,2.097886967,-0.8873135564,-3.06024109,-3.678989552,-2.666365632,-0.6180339887,1.452191817,2.529722611,2.016594378,-0.01374122059,-2.824429495,-5.285215072,-6.302694708,-5.246870619,-2.210419738,2,6.13956874,8.965976562,9.68000641,8.201089581,5.175570505,1.716858387,-1.02183483,-2.278560533,-1.953524751,-0.6180339887,0.7393509358,1.129293593,-0.02181188158,-2.617913164,-5.902113033,-8.727381729,-9.987404016,-9.043589913,-5.984648344,-1.618033989,2.805900027,6.034770001,7.255101454,6.368389697]
def loss(params):
p = pred(params, np.array(x))
return jnp.mean((np.array(y)-p)**2)
params = np.array([np.random.random()*100 for _ in range(6)])
for _ in range(10000):
g = grad(loss)
params = params - 0.001*g(params)
print("Relaxed solution", cast(params), "loss=", loss(params))
constrained_params = np.round(cast(params))
print("Integer solution", constrained_params, "loss=", loss(constrained_params))
print()
Since the problem will have a lot of local minima, you might need to run it multiple times.
It's quite hard to use gradient descent to find a solution to this problem, because it tends to get stuck when changing the L, M, or N parameters. The gradients for those can push it away from the right solution, unless it is very close to an optimal solution already.
There are ways to get around this, such as basinhopping or random search, but because of the function you're trying to learn, you have a better alternative.
Since you're trying to learn a sinusoid function, you can use an FFT to find the frequencies of the sine waves. Once you have those frequencies, you can find the amplitudes and phases used to generate the same sine wave.
Pardon the messiness of this code, this is my first time using an FFT.
import scipy.fft
import numpy as np
import math
import matplotlib.pyplot as plt
def get_top_frequencies(x, y, num_freqs):
x = np.array(x)
y = np.array(y)
# Find timestep (assume constant timestep)
dt = abs(x[0] - x[-1]) / (len(x) - 1)
# Take discrete FFT of y
spectral = scipy.fft.fft(y)
freq = scipy.fft.fftfreq(y.shape[0], d=dt)
# Cut off top half of frequencies. Assumes input signal is real, and not complex.
spectral = spectral[:int(spectral.shape[0] / 2)]
# Double amplitudes to correct for cutting off top half.
spectral *= 2
# Adjust amplitude by sampling timestep
spectral *= dt
# Get ampitudes for all frequencies. This is taking the magnitude of the complex number
spectral_amplitude = np.abs(spectral)
# Pick frequencies with highest amplitudes
highest_idx = np.argsort(spectral_amplitude)[::-1][:num_freqs]
# Find amplitude, frequency, and phase components of each term
highest_amplitude = spectral_amplitude[highest_idx]
highest_freq = freq[highest_idx]
highest_phase = np.angle(spectral[highest_idx]) / math.pi
# Convert it into a Python function
function = ["def func(x):", "return ("]
for i, components in enumerate(zip(highest_amplitude, highest_freq, highest_phase)):
amplitude, freq, phase = components
plus_sign = " +" if i != (num_freqs - 1) else ""
term = f"{amplitude:.2f} * math.cos(2 * math.pi * {freq:.2f} * x + math.pi * {phase:.2f}){plus_sign}"
function.append(" " + term)
function.append(")")
return "\n ".join(function)
x = [0,0.01,0.02,0.03,0.04,0.05,0.06,0.07,0.08,0.09,0.1,0.11,0.12,0.13,0.14,0.15,0.16,0.17,0.18,0.19,0.2,0.21,0.22,0.23,0.24,0.25,0.26,0.27,0.28,0.29,0.3,0.31,0.32,0.33,0.34,0.35,0.36,0.37,0.38,0.39,0.4,0.41,0.42,0.43,0.44,0.45,0.46,0.47,0.48,0.49,0.5,0.51,0.52,0.53,0.54,0.55,0.56,0.57,0.58,0.59,0.6,0.61,0.62,0.63,0.64,0.65,0.66,0.67,0.68,0.69,0.7,0.71,0.72,0.73,0.74,0.75,0.76,0.77,0.78,0.79,0.8,0.81,0.82,0.83,0.84,0.85,0.86,0.87,0.88,0.89,0.9,0.91,0.92,0.93,0.94,0.95,0.96,0.97,0.98,0.99]
y = [4,1.240062433,-0.7829654986,-1.332487982,-0.3337640721,1.618033989,3.512512389,4.341307895,3.515268061,1.118929599,-2.097886967,-4.990538967,-6.450324073,-5.831575611,-3.211486891,0.6180339887,4.425660706,6.980842552,7.493970785,5.891593744,2.824429495,-0.5926374511,-3.207870455,-4.263694544,-3.667432785,-2,-0.2617162175,0.5445886005,-0.169441247,-2.323237059,-5.175570505,-7.59471091,-8.488730333,-7.23200463,-3.924327772,0.6180339887,5.138501587,8.38127157,9.532377045,8.495765687,5.902113033,2.849529206,0.4768388529,-0.46697525,0.106795821,1.618033989,3.071952496,3.475795162,2.255463709,-0.4905371745,-4,-7.117914956,-8.727599664,-8.178077181,-5.544088451,-1.618033989,2.365340134,5.169257268,5.995297102,4.758922924,2.097886967,-0.8873135564,-3.06024109,-3.678989552,-2.666365632,-0.6180339887,1.452191817,2.529722611,2.016594378,-0.01374122059,-2.824429495,-5.285215072,-6.302694708,-5.246870619,-2.210419738,2,6.13956874,8.965976562,9.68000641,8.201089581,5.175570505,1.716858387,-1.02183483,-2.278560533,-1.953524751,-0.6180339887,0.7393509358,1.129293593,-0.02181188158,-2.617913164,-5.902113033,-8.727381729,-9.987404016,-9.043589913,-5.984648344,-1.618033989,2.805900027,6.034770001,7.255101454,6.368389697]
print(get_top_frequencies(x, y, 3))
That produces this function:
def func(x):
return (
5.00 * math.cos(2 * math.pi * 10.00 * x + math.pi * 0.50) +
4.00 * math.cos(2 * math.pi * 5.00 * x + math.pi * -0.00) +
2.00 * math.cos(2 * math.pi * 3.00 * x + math.pi * -0.50)
)
Which is not quite the format you specified - you asked for two sins and one cos, and for no phase parameter. However, using the trigonometric identity cos(x) = sin(pi/2 - x), you can convert this into an equivalent expression that matches what you want:
def func(x):
return (
5.00 * math.sin(2 * math.pi * -10.00 * x) +
4.00 * math.cos(2 * math.pi * 5.00 * x) +
2.00 * math.sin(2 * math.pi * 3.00 * x)
)
And there's the original function!

overflow in exponential function while trying to integrate

I want to numerically integrate a discrete dataset (given ad pandas series) -here orange- which is multiplied with a given analytical exponential function (derivative of a Fermi-Dirac-Distribution) -here blue-. However I fail when the exponent becomes large (e.g. for small T) and thus the derivative fermi_dT(E, mu, T)explodes. I couldn't find a way to rewrite fermi_dT(E, mu, T)in an appropriate way to get it done.
Below is a minimal example (not with pandas series), where I simulated the dataset by a Gaussian.
If T<30. I'll get an overflow. Does anyone see a clever way to get around?
import numpy as np
from scipy import integrate
import matplotlib.pyplot as plt
scale_plot = 1e6
kB = 8.618292134831462e-5 #in eV
Ef = 2.0
def gaussian(E, amp, E0, sig):
return amp * np.exp(-(E-E0)**2 / sig)
def fermi_dT(E, mu, T):
return ((np.exp((E - mu) / (kB * T))*(E-mu)) / ((1 + np.exp((E - mu) / (kB * T)))**2*kB*T**2))
T = 100.0
energies = np.arange(1.,3.,0.001)
plt.plot(energies, (energies-Ef)*fermi_dT(energies, Ef, T))
plt.plot(energies, gaussian(energies, 1e-5, 1.8, .01))
plt.plot(energies, gaussian(energies, 1e-5, 1.8, .01)*(energies-Ef)*fermi_dT(energies, Ef, T)*scale_plot)
plt.show()
cum = integrate.cumtrapz(gaussian(energies, 1e-5, 1.8, .01)*(energies-Ef)*fermi_dT(energies, Ef, T), energies)
print(cum[-1])
This kind of numerical issue is quite usual when dealing with exponential derivatives. The trick is to compute first the log, and only after to apply the exponential:
log(a*exp(b) / (1 + c*exp(d)) ** k) = log(a) + b - k * log(1 + exp(log(c) + d)))
Now, you need to find a way to compute log(1 + exp(x)) accurately. Lucky for you, people have done it before, according to this post. So maybe you could rewrite fermi_dT using log1p:
import numpy as np
def softplus(x, limit=30):
val = np.empty_like(x)
val[x>=limit] = x[x>=limit]
val[x<limit] = np.log1p(np.exp(x[x<limit]))
return val
def fermi_dT(E, mu, T):
a = (E - mu) / (kB * T ** 2)
b = d = (E - mu) / (kB * T)
k = 2
val = np.empty_like(E)
val[E-mu>=0] = np.exp(np.log(a[E-mu>=0]) + b[E-mu>=0] - k * softplus(d[E-mu>=0]))
val[E-mu<0] = -np.exp(np.log(-a[E-mu<0]) + b[E-mu<0] - k * softplus(d[E-mu<0]))
return val

Error: TypeError: can't multiply sequence by non-int of type 'numpy.float64' <Figure size 432x288 with 0 Axes> (don't know what to do)

I'm trying to plot an equation that contains and definite integral. It is a photoionisation cross section associated with the intersubband transitions in a two-dimensional quantum ring. I made the angular part analytically, and I'm trying to calculate numerically the radial part.
Here's my attempt to implement this in a Python code:
from scipy.integrate import quad
import numpy as np
from scipy.special import gamma
from scipy.constants import alpha
import matplotlib.pyplot as plt
#Constants
epsilon = 13.1 #dielectric constant of the material
gamma_C = 0.5 # donor impurity linewidth
nr = 3.2 #refractive index of semiconductor
flux = 0 # Phi in eqn 8 magnetic flux
R = 5 #radius of the qunatum ring in nm
r = np.linspace(0, 6 * R)
rho = r / R
m_0 = 0.0067*0.511 # electron effective mass
h = 4.13e-15 # Planck constant in eV
hbar = 6.58e-16 # reduced Planck constant in eV
#Photon energy
hnu = np.linspace(0, 100) #in eV
#Function that calculates the integrand
def func(rho):
betai = np.sqrt( gama**4/4)
betaf = np.sqrt(1+gama**4/2)
return ((gama * rho)**(betai + betaf) *
np.exp(-1/2*(gama * rho)**2)
* (gama * rho)**2/2 )
def cross_section(hnu, gama):
#function that calculates the photoionisation cross section
betai = np.sqrt( gama**4/4)
betaf = np.sqrt(1+gama**4/2)
Ei = gama**2*(1+betai)-gama**4/2
Ef = gama**2*(3+betaf)-gama**4/2
return (nr/epsilon * 4*np.pi/3 * alpha * hnu *
(abs(R * np.sqrt(1/2**betai*gamma(betai + 1))*
np.sqrt(1/2**betaf*gamma(betaf + 1)) *
quad(func, 0, np.infty))**2 *
hbar * gamma_C/(Ef - Ei - hnu)**2 + ( hbar * gamma_C)**2))
#Plot
plt.figure();plt.clf()
for gama in [1.0, 1.5, 2.0]:
plt.plot(hnu, cross_section(hnu, gama))
But I keep receiving this error
TypeError: can't multiply sequence by non-int of type 'numpy.float64'
Anyone knows the cause and how can avoid this?
Take another look at the docstring for scipy.integrate.quad. In particular, look at the 'Returns' section. You'll see that it returns multiple values. More precisely, it returns a tuple of values. The actual number of values depends on the parameter full_output, but it always includes at least two values, the numerically computed integral and the error estimate.
In this code
return (nr/epsilon * 4*np.pi/3 * alpha * hnu *
(abs(R * np.sqrt(1/2**betai*gamma(betai + 1))*
np.sqrt(1/2**betaf*gamma(betaf + 1)) *
quad(func, 0, np.infty))**2 *
hbar * gamma_C/(Ef - Ei - hnu)**2 + ( hbar * gamma_C)**2))
you use the return value of quad, but that is a tuple, so it won't work correctly in that expression. To fix it, just pull out the first value of the tuple returned by quad. That is, replace quad(func, 0, np.infty) with quad(func, 0, np.infty)[0]:
return (nr/epsilon * 4*np.pi/3 * alpha * hnu *
(abs(R * np.sqrt(1/2**betai*gamma(betai + 1))*
np.sqrt(1/2**betaf*gamma(betaf + 1)) *
quad(func, 0, np.infty)[0])**2 *
hbar * gamma_C/(Ef - Ei - hnu)**2 + ( hbar * gamma_C)**2))

Fitting data with a custom distribution using scipy.stats

So I noticed that there is no implementation of the Skewed generalized t distribution in scipy. It would be useful for me to fit this is distribution to some data I have. Unfortunately fit doesn't seem to be working in this case for me. To explain further I have implemented it like so
import numpy as np
import pandas as pd
import scipy.stats as st
from scipy.special import beta
class sgt(st.rv_continuous):
def _pdf(self, x, mu, sigma, lam, p, q):
v = q ** (-1 / p) * \
((3 * lam ** 2 + 1) * (
beta(3 / p, q - 2 / p) / beta(1 / p, q)) - 4 * lam ** 2 *
(beta(2 / p, q - 1 / p) / beta(1 / p, q)) ** 2) ** (-1 / 2)
m = 2 * v * sigma * lam * q ** (1 / p) * beta(2 / p, q - 1 / p) / beta(
1 / p, q)
fx = p / (2 * v * sigma * q ** (1 / p) * beta(1 / p, q) * (
abs(x - mu + m) ** p / (q * (v * sigma) ** p) * (
lam * np.sign(x - mu + m) + 1) ** p + 1) ** (
1 / p + q))
return fx
def _argcheck(self, mu, sigma, lam, p, q):
s = sigma > 0
l = -1 < lam < 1
p_bool = p > 0
q_bool = q > 0
all_bool = s & l & p_bool & q_bool
return all_bool
This all works fine and I can generate random variables with given parameters no problem. The _argcheck is required as a simple positive params only check is not suitable.
sgt_inst = sgt(name='sgt')
vars = sgt_inst.rvs(mu=1, sigma=3, lam = -0.1, p = 2, q = 50, size = 100)
However, when I try fit these parameters I get an error
sgt_inst.fit(vars)
RuntimeWarning: invalid value encountered in subtract
numpy.max(numpy.abs(fsim[0] - fsim[1:])) <= fatol):
and it just returns
What I find strange is that when I implement the example custom Gaussian distribution as shown in the docs, it has no problem running the fit method.
Any ideas?
As fit docstring says,
Starting estimates for the fit are given by input arguments; for any arguments not provided with starting estimates, self._fitstart(data) is called to generate such.
Calling sgt_inst._fitstart(data) returns (1.0, 1.0, 1.0, 1.0, 1.0, 0, 1) (the first five are shape parameters, the last two are loc and scale). Looks like _fitstart is not a sophisticated process. The parameter l it picks does not meet your argcheck requirement.
Conclusion: provide your own starting parameters for fit, e.g.,
sgt_inst.fit(data, 0.5, 0.5, -0.5, 2, 10)
returns (1.4587093459289049, 5.471769032259468, -0.02391466905874927, 7.07289326147152
4, 0.741434497805832, -0.07012808188413872, 0.5308181287869771) for my random data.

Categories

Resources