I am trying to fit model data (calculated from eR) to my experimental data e_exp. I am not quite sure how to pass constants and variables to func.
import numpy as np
import math
from scipy.optimize import curve_fit, least_squares, minimize
f_exp = np.array([1, 1.6, 2.7, 4.4, 7.3, 12, 20, 32, 56, 88, 144, 250000])
e_exp = np.array([7.15, 7.30, 7.20, 7.25, 7.26, 7.28, 7.32, 7.25, 7.35, 7.34, 7.37, 13.55])
ezero = np.min(e_exp)
einf = np.max(e_exp)
ig_fc = 500
ig_alpha = 0.35
def CCER(einf, ezero, f_exp, fc, alpha):
    x = [np.log(_ / ig_fc) for _ in f_exp]
    eR = [ezero + 1/2 * (einf - ezero) * (1 + np.sinh((1 - ig_alpha) * _) / (np.cosh((1 - ig_alpha) * _) + np.sin(1/2 * ig_alpha * math.pi))) for _ in x]
    return eR
def func(z):
    return np.sum((CCER(z[0], z[1], z[2], z[3], z[4], z[5]) - e_exp) ** 2)
res = minimize(func, (ig_fc, ig_alpha), method='SLSQP')
einf, ezero, and f_exp are all constants; the variables I need to optimize are ig_fc and ig_alpha, where ig stands for initial guess.
How can I make this work?
I am also not sure which of the optimization algorithms from scipy are best suited for my problem (be it curve_fit, least_squares or minimize).
I believe what you want is the following:
def CCER(x, fc, alpha):
    y = np.log(x/fc)
    eR = ezero + 1/2 * (einf - ezero) * (1 + np.sinh((1 - alpha) * y) / (np.cosh((1 - alpha) * y) + np.sin(1/2 * alpha * math.pi)))
    return eR
res = curve_fit(CCER, f_exp, e_exp, p0=(ig_fc, ig_alpha))
curve_fit treats the first argument of CCER (x) as the independent variable; the remaining arguments (fc and alpha) are the parameters it optimizes. All fixed values (ezero and einf) are read from the enclosing scope, so there is no need to pass them explicitly to the function here.
Finally, curve_fit only needs the array of inputs (f_exp), the corresponding outputs (e_exp) and, optionally, a tuple of initial guesses p0.
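As a rough illustration of how the result can be unpacked: curve_fit returns a tuple of the optimal parameters and their estimated covariance matrix. The sketch below is not the asker's data; it uses synthetic, noise-free values generated from known parameters (an assumption made only so the outcome can be checked), and the bounds are an optional addition that keeps fc positive so the logarithm stays defined.

```python
import numpy as np
from scipy.optimize import curve_fit

# Fixed quantities, read from the enclosing scope by the model (illustrative values).
ezero, einf = 7.15, 13.55

def CCER(x, fc, alpha):
    # x is the independent variable; fc and alpha are the fitted parameters.
    y = np.log(x / fc)
    return ezero + 0.5 * (einf - ezero) * (
        1 + np.sinh((1 - alpha) * y)
        / (np.cosh((1 - alpha) * y) + np.sin(0.5 * alpha * np.pi))
    )

# Synthetic data from known parameters (assumption: stands in for real measurements).
f_syn = np.logspace(0, 6, 40)
e_syn = CCER(f_syn, 500.0, 0.35)

# Bounds keep fc > 0 and 0 <= alpha <= 1 during the search.
popt, pcov = curve_fit(
    CCER, f_syn, e_syn, p0=(300.0, 0.3),
    bounds=([1e-3, 0.0], [1e7, 1.0]),
)
fc_fit, alpha_fit = popt
perr = np.sqrt(np.diag(pcov))  # one-sigma parameter uncertainties
print(fc_fit, alpha_fit, perr)
```

With real, noisy data the fitted values and uncertainties will of course depend on how well the measurements constrain the model.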
I'm trying to calculate a double integral given by :
import numpy as np
import scipy.special as sc
from numpy.lib.scimath import sqrt as csqrt
from scipy.integrate import dblquad
def g_re(alpha, beta, k, m):
    psi = csqrt(alpha ** 2 + beta ** 2 - k ** 2)
    return np.real(
        sc.jv(m, alpha)
        * sc.jv(m, beta)
        * sc.jv(m, alpha)
        * np.sin(beta)
        * sc.jv(m, -1j * psi)
        * np.exp(-psi)
        / (alpha ** 2 * psi)
    )
def g_im(alpha, beta, k, m):
    psi = csqrt(alpha ** 2 + beta ** 2 - k ** 2)
    return np.imag(
        sc.jv(m, alpha)
        * sc.jv(m, beta)
        * sc.jv(m, alpha)
        * np.sin(beta)
        * sc.jv(m, -1j * psi)
        * np.exp(-psi)
        / (alpha ** 2 * psi)
    )
k = 5
m = 0
tuple_args = (k, m)
ans = dblquad(g_re, 0.0, np.inf, 0, np.inf, args=tuple_args)[0]
ans += 1j * dblquad(g_im, 0.0, np.inf, 0, np.inf, args=tuple_args)[0]
The integration intervals are along the positive real axes ([0, np.inf[). When calculating I got the following warning :
/tmp/a.py:10: RuntimeWarning: invalid value encountered in multiply
sc.jv(m, alpha)
/home/nschloe/.local/lib/python3.9/site-packages/scipy/integrate/quadpack.py:879: IntegrationWarning: The maximum number of subdivisions (50) has been achieved.
If increasing the limit yields no improvement it is advised to analyze
the integrand in order to determine the difficulties. If the position of a
local difficulty can be determined (singularity, discontinuity) one will
probably gain from splitting up the interval and calling the integrator
on the subranges. Perhaps a special-purpose integrator should be used.
quad_r = quad(f, low, high, args=args, full_output=self.full_output,
I subdivided the domain of integration but I still got the same warning. Could you help me please.
normal pdf:
import numpy as np
import scipy
def gaussian(x, mu = 0, sigma = 1):
    return 1/(np.sqrt(2*np.pi*sigma)) * np.exp(-(x-mu)**2 / (0.5*sigma)**2)
integrate over entire support:
scipy.integrate.quad(gaussian, -np.inf, np.inf)
returns
(0.3535533905932738, 1.4635936470160148e-11)
I know I messed up somewhere in the pdf, but I've been staring at it for an hour and I can't see it.
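Your gaussian function is incorrect: the normalization factor should be sigma * sqrt(2*pi) rather than sqrt(2*pi*sigma), and the exponent should be divided by 2*sigma**2 rather than (0.5*sigma)**2. It should be:
def gaussian(x, mu = 0, sigma = 1):
    return (1/(sigma * np.sqrt(2 * np.pi))) * np.exp((-(x-mu)**2) / (2 * sigma ** 2))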
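A quick way to check a hand-written density like this is to integrate it over the whole real line and to compare it with scipy.stats.norm.pdf. The sketch below does both; the values mu = 1 and sigma = 2 are arbitrary choices for illustration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def gaussian(x, mu=0, sigma=1):
    # Normal pdf: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi))
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

mu, sigma = 1.0, 2.0  # arbitrary test values

# A correctly normalized pdf integrates to 1 over its support.
total, abserr = quad(gaussian, -np.inf, np.inf, args=(mu, sigma))

# Cross-check pointwise against SciPy's reference implementation.
xs = np.linspace(-5, 7, 13)
matches_scipy = np.allclose(gaussian(xs, mu, sigma), norm.pdf(xs, loc=mu, scale=sigma))
print(total, matches_scipy)
```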
I want to numerically integrate a discrete dataset (given as a pandas Series, shown in orange) multiplied by a given analytical exponential function (the derivative of a Fermi-Dirac distribution, shown in blue). However, this fails when the exponent becomes large (e.g. for small T) and the derivative fermi_dT(E, mu, T) overflows. I couldn't find a way to rewrite fermi_dT(E, mu, T) so that it stays numerically stable.
Below is a minimal example (not with pandas series), where I simulated the dataset by a Gaussian.
If T < 30, I get an overflow. Does anyone see a clever way around this?
import numpy as np
from scipy import integrate
import matplotlib.pyplot as plt
scale_plot = 1e6
kB = 8.618292134831462e-5 #in eV
Ef = 2.0
def gaussian(E, amp, E0, sig):
    return amp * np.exp(-(E-E0)**2 / sig)
def fermi_dT(E, mu, T):
    return ((np.exp((E - mu) / (kB * T))*(E-mu)) / ((1 + np.exp((E - mu) / (kB * T)))**2*kB*T**2))
T = 100.0
energies = np.arange(1.,3.,0.001)
plt.plot(energies, (energies-Ef)*fermi_dT(energies, Ef, T))
plt.plot(energies, gaussian(energies, 1e-5, 1.8, .01))
plt.plot(energies, gaussian(energies, 1e-5, 1.8, .01)*(energies-Ef)*fermi_dT(energies, Ef, T)*scale_plot)
plt.show()
cum = integrate.cumtrapz(gaussian(energies, 1e-5, 1.8, .01)*(energies-Ef)*fermi_dT(energies, Ef, T), energies)
print(cum[-1])
This kind of numerical issue is quite common when dealing with derivatives of exponentials. The trick is to compute the logarithm first and apply the exponential only afterwards:
log(a*exp(b) / (1 + c*exp(d)) ** k) = log(a) + b - k * log(1 + exp(log(c) + d))
Now, you need to find a way to compute log(1 + exp(x)) accurately. Lucky for you, people have done it before, according to this post. So maybe you could rewrite fermi_dT using log1p:
import numpy as np
def softplus(x, limit=30):
    val = np.empty_like(x)
    val[x>=limit] = x[x>=limit]
    val[x<limit] = np.log1p(np.exp(x[x<limit]))
    return val
def fermi_dT(E, mu, T):
    a = (E - mu) / (kB * T ** 2)
    b = d = (E - mu) / (kB * T)
    k = 2
    val = np.empty_like(E)
    val[E-mu>=0] = np.exp(np.log(a[E-mu>=0]) + b[E-mu>=0] - k * softplus(d[E-mu>=0]))
    val[E-mu<0] = -np.exp(np.log(-a[E-mu<0]) + b[E-mu<0] - k * softplus(d[E-mu<0]))
    return val
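As a possible alternative to the log-based rewrite (not part of the answer above), one can use the identity exp(x) / (1 + exp(x))**2 = expit(x) * expit(-x), where scipy.special.expit is the logistic function 1 / (1 + exp(-x)). Neither factor overflows for large |x|. The sketch below assumes the value of kB from the question; fermi_dT_stable and fermi_dT_naive are hypothetical names used only for this comparison.

```python
import numpy as np
from scipy.special import expit

kB = 8.618292134831462e-5  # in eV, value taken from the question

def fermi_dT_naive(E, mu, T):
    # Direct transcription of the question's formula; overflows for small T.
    x = (E - mu) / (kB * T)
    return np.exp(x) * (E - mu) / ((1 + np.exp(x)) ** 2 * kB * T ** 2)

def fermi_dT_stable(E, mu, T):
    # (E - mu) / (kB T^2) = x / T, and exp(x) / (1 + exp(x))^2 = expit(x) * expit(-x)
    x = (np.asarray(E, dtype=float) - mu) / (kB * T)
    return x / T * expit(x) * expit(-x)

energies = np.arange(1.0, 3.0, 0.001)

# Where the naive form still works (T = 100), both versions should agree.
agree = np.allclose(fermi_dT_stable(energies, 2.0, 100.0),
                    fermi_dT_naive(energies, 2.0, 100.0))

# At very low temperature the stable form should remain finite.
low_T = fermi_dT_stable(energies, 2.0, 1.0)
print(agree, np.all(np.isfinite(low_T)))
```

Because this version needs no boolean masks, it also works unchanged on scalars.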
I'm trying to plot an equation that contains a definite integral. It is a photoionisation cross section associated with the intersubband transitions in a two-dimensional quantum ring. I did the angular part analytically, and I'm trying to calculate the radial part numerically.
Here's my attempt to implement this in a Python code:
from scipy.integrate import quad
import numpy as np
from scipy.special import gamma
from scipy.constants import alpha
import matplotlib.pyplot as plt
#Constants
epsilon = 13.1 #dielectric constant of the material
gamma_C = 0.5 # donor impurity linewidth
nr = 3.2 #refractive index of semiconductor
flux = 0 # Phi in eqn 8 magnetic flux
R = 5 #radius of the quantum ring in nm
r = np.linspace(0, 6 * R)
rho = r / R
m_0 = 0.0067*0.511 # electron effective mass
h = 4.13e-15 # Planck constant in eV
hbar = 6.58e-16 # reduced Planck constant in eV
#Photon energy
hnu = np.linspace(0, 100) #in eV
#Function that calculates the integrand
def func(rho):
    betai = np.sqrt( gama**4/4)
    betaf = np.sqrt(1+gama**4/2)
    return ((gama * rho)**(betai + betaf) *
            np.exp(-1/2*(gama * rho)**2)
            * (gama * rho)**2/2 )
def cross_section(hnu, gama):
    #function that calculates the photoionisation cross section
    betai = np.sqrt( gama**4/4)
    betaf = np.sqrt(1+gama**4/2)
    Ei = gama**2*(1+betai)-gama**4/2
    Ef = gama**2*(3+betaf)-gama**4/2
    return (nr/epsilon * 4*np.pi/3 * alpha * hnu *
            (abs(R * np.sqrt(1/2**betai*gamma(betai + 1))*
            np.sqrt(1/2**betaf*gamma(betaf + 1)) *
            quad(func, 0, np.infty))**2 *
            hbar * gamma_C/(Ef - Ei - hnu)**2 + ( hbar * gamma_C)**2))
#Plot
plt.figure();plt.clf()
for gama in [1.0, 1.5, 2.0]:
    plt.plot(hnu, cross_section(hnu, gama))
But I keep receiving this error
TypeError: can't multiply sequence by non-int of type 'numpy.float64'
Does anyone know the cause and how I can avoid this?
Take another look at the docstring for scipy.integrate.quad. In particular, look at the 'Returns' section. You'll see that it returns multiple values. More precisely, it returns a tuple of values. The actual number of values depends on the parameter full_output, but it always includes at least two values, the numerically computed integral and the error estimate.
In this code
return (nr/epsilon * 4*np.pi/3 * alpha * hnu *
(abs(R * np.sqrt(1/2**betai*gamma(betai + 1))*
np.sqrt(1/2**betaf*gamma(betaf + 1)) *
quad(func, 0, np.infty))**2 *
hbar * gamma_C/(Ef - Ei - hnu)**2 + ( hbar * gamma_C)**2))
you use the return value of quad, but that is a tuple, so it won't work correctly in that expression. To fix it, just pull out the first value of the tuple returned by quad. That is, replace quad(func, 0, np.infty) with quad(func, 0, np.infty)[0]:
return (nr/epsilon * 4*np.pi/3 * alpha * hnu *
(abs(R * np.sqrt(1/2**betai*gamma(betai + 1))*
np.sqrt(1/2**betaf*gamma(betaf + 1)) *
quad(func, 0, np.infty)[0])**2 *
hbar * gamma_C/(Ef - Ei - hnu)**2 + ( hbar * gamma_C)**2))
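To make the tuple return value concrete, here is a minimal sketch with a simpler, hypothetical integrand (not the one from the question). It unpacks the result of quad into the integral and its error estimate, and passes the parameter through args instead of relying on a global variable. It uses np.inf, since the np.infty alias is no longer available in NumPy 2.0 and later.

```python
import numpy as np
from scipy.integrate import quad

def integrand(rho, gama):
    # Hypothetical stand-in for the radial integrand: a simple Gaussian decay.
    return np.exp(-0.5 * (gama * rho) ** 2)

# quad returns (integral, absolute error estimate); unpack both explicitly.
value, abserr = quad(integrand, 0, np.inf, args=(1.0,))

# For gama = 1 the exact result is sqrt(pi / 2).
print(value, abserr, np.sqrt(np.pi / 2))
```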
I am interested in doing a 2D numerical integration. Right now I am using the scipy.integrate.dblquad but it is very slow. Please see the code below. My need is to evaluate this integral 100s of times with completely different parameters. Hence I want to make the processing as fast and efficient as possible. The code is:
import numpy as np
from scipy import integrate
from scipy.special import erf
from scipy.special import j0
import time
q = np.linspace(0.03, 1.0, 1000)
start = time.time()
def f(q, z, t):
    return t * 0.5 * (erf((t - z) / 3) - 1) * j0(q * t) * (1 / (np.sqrt(2 * np.pi) * 2)) * np.exp(
        -0.5 * ((z - 40) / 2) ** 2)
y = np.empty([len(q)])
for n in range(len(q)):
    y[n] = integrate.dblquad(lambda t, z: f(q[n], z, t), 0, 50, lambda z: 10, lambda z: 60)[0]
end = time.time()
print(end - start)
Time taken is
212.96751403808594
This is too slow. Please suggest a better way to achieve what I want. I searched before coming here but didn't find a solution. I have read that quadpy can do this job better and much faster, but I have no idea how to implement it. Please help.
You could use Numba or a low-level-callable
Almost your example
I simply pass the function directly to scipy.integrate.dblquad instead of using lambdas to generate functions.
import numpy as np
from scipy import integrate
from scipy.special import erf
from scipy.special import j0
import time
q = np.linspace(0.03, 1.0, 1000)
start = time.time()
def f(t, z, q):
    return t * 0.5 * (erf((t - z) / 3) - 1) * j0(q * t) * (1 / (np.sqrt(2 * np.pi) * 2)) * np.exp(
        -0.5 * ((z - 40) / 2) ** 2)
def lower_inner(z):
    return 10.
def upper_inner(z):
    return 60.
y = np.empty(len(q))
for n in range(len(q)):
    y[n] = integrate.dblquad(f, 0, 50, lower_inner, upper_inner,args=(q[n],))[0]
end = time.time()
print(end - start)
#143.73969149589539
This is already a tiny bit faster (143 vs. 151s), but its main purpose is to serve as a simple baseline to optimize.
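Since the argument order of dblquad is easy to get wrong, here is a small sketch on a toy integrand (hypothetical, chosen because its value is known in closed form). dblquad calls func(y, x, *args): the first argument is the inner variable, whose limits come from gfun and hfun, and the second is the outer variable, which runs from a to b.

```python
from scipy.integrate import dblquad

def integrand(y, x, c):
    # y: inner variable (limits from gfun/hfun), x: outer variable (a to b),
    # c: extra parameter forwarded through args.
    return c * x * y ** 2

# Outer x from 0 to 1, inner y from 0 to 2, with c = 3:
# 3 * (integral of x over [0, 1]) * (integral of y^2 over [0, 2]) = 3 * 1/2 * 8/3 = 4
value, abserr = dblquad(integrand, 0.0, 1.0, lambda x: 0.0, lambda x: 2.0, args=(3.0,))
print(value, abserr)
```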
Simply compiling the functions using Numba
To get this to run you need additionally Numba and numba-scipy. The purpose of numba-scipy is to provide wrapped functions from scipy.special.
import numpy as np
from scipy import integrate
from scipy.special import erf
from scipy.special import j0
import time
import numba as nb
q = np.linspace(0.03, 1.0, 1000)
start = time.time()
#error_model="numpy" -> Don't check for division by zero
@nb.njit(error_model="numpy",fastmath=True)
def f(t, z, q):
    return t * 0.5 * (erf((t - z) / 3) - 1) * j0(q * t) * (1 / (np.sqrt(2 * np.pi) * 2)) * np.exp(
        -0.5 * ((z - 40) / 2) ** 2)
def lower_inner(z):
    return 10.
def upper_inner(z):
    return 60.
y = np.empty(len(q))
for n in range(len(q)):
    y[n] = integrate.dblquad(f, 0, 50, lower_inner, upper_inner,args=(q[n],))[0]
end = time.time()
print(end - start)
#8.636585235595703
Using a low level callable
The scipy.integrate functions also accept a C callback function instead of a Python function. Such functions can be written in, for example, C, Cython or Numba, which I use in this example. The main advantage is that no Python interpreter interaction is necessary on each function call.
An excellent answer by @Jacques Gaudin shows an easy way to do this, including additional arguments.
import numpy as np
from scipy import integrate
from scipy.special import erf
from scipy.special import j0
import time
import numba as nb
from numba import cfunc
from numba.types import intc, CPointer, float64
from scipy import LowLevelCallable
q = np.linspace(0.03, 1.0, 1000)
start = time.time()
def jit_integrand_function(integrand_function):
    jitted_function = nb.njit(integrand_function, nopython=True)
    #error_model="numpy" -> Don't check for division by zero
    @cfunc(float64(intc, CPointer(float64)),error_model="numpy",fastmath=True)
    def wrapped(n, xx):
        ar = nb.carray(xx, n)
        return jitted_function(ar[0], ar[1], ar[2])
    return LowLevelCallable(wrapped.ctypes)
@jit_integrand_function
def f(t, z, q):
    return t * 0.5 * (erf((t - z) / 3) - 1) * j0(q * t) * (1 / (np.sqrt(2 * np.pi) * 2)) * np.exp(
        -0.5 * ((z - 40) / 2) ** 2)
def lower_inner(z):
    return 10.
def upper_inner(z):
    return 60.
y = np.empty(len(q))
for n in range(len(q)):
    y[n] = integrate.dblquad(f, 0, 50, lower_inner, upper_inner,args=(q[n],))[0]
end = time.time()
print(end - start)
#3.2645838260650635
Generally, it is much faster to do the summation via array operations than to use scipy.integrate.quad (or dblquad). You could rewrite your f(q, z, t) to take q, z and t vectors and return a 3D array of f-values using np.tensordot, then multiply the function values by your area element (dt dz) and sum them using np.sum. If your area element is not constant, you have to make an array of area elements and use np.einsum. To take your integration limits into account, you can use a masked array to mask the function values outside the limits before summing. Note that np.einsum ignores masks, so if you use einsum, use np.where to set the function values outside your integration limits to zero. Example (with constant area element and simple integration limits):
import numpy as np
import scipy.special as ss
import time
def f(q, t, z):
    # Making 3D arrays before computation for readability. You can save some time by
    # using tensordot directly when computing the output
    Mq = np.tensordot(q, np.ones((len(t), len(z))), axes=0)
    Mt = np.tensordot(np.ones(len(q)), np.tensordot(t, np.ones(len(z)), axes = 0), axes = 0)
    Mz = np.tensordot(np.ones((len(q), len(t))), z, axes = 0)
    # j0(q * t) as in the question's integrand
    return Mt * 0.5 * (ss.erf((Mt - Mz) / 3) - 1) * ss.j0(Mq * Mt) * (1 / (np.sqrt(2 * np.pi) * 2)) * np.exp(
        -0.5 * ((Mz - 40) / 2) ** 2)
q = np.linspace(0.03, 1, 1000)
t = np.linspace(0, 50, 250)
z = np.linspace(10, 60, 250)
#if you have a constant dA you can shave off some time by computing dA without using np.diff
#if dA is variable, you have to make an array of dA values and use np.einsum instead of np.sum
t0 = time.process_time()
dA = np.diff(t)[0] * np.diff(z)[0]
func_vals = f(q, t, z)
I = np.sum(func_vals * dA, axis=(1, 2))
t1 = time.process_time()
This took 18.5 s on my 2012 MacBook Pro (2.5 GHz i5) with dA = 0.04. Doing things this way also allows you to easily trade precision against efficiency, and to set dA to a value that makes sense when you know how your function behaves.
However, note that if you want a larger number of points, you have to split up your integral, or else you risk running out of memory: 1000 x 1000 x 1000 doubles require 8 GB of RAM. So if you are doing very large integrations with high precision, it can be worth doing a quick check of the memory required before running.
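The memory check mentioned above can be as simple as multiplying the grid dimensions by the item size. The helper below is a hypothetical sketch (grid_memory_gib is not a library function); n_arrays accounts for the fact that several intermediate arrays of the full 3D shape may exist at the same time.

```python
import numpy as np

def grid_memory_gib(nq, nt, nz, dtype=np.float64, n_arrays=1):
    # Bytes needed for n_arrays dense arrays of shape (nq, nt, nz), in GiB.
    n_bytes = nq * nt * nz * np.dtype(dtype).itemsize * n_arrays
    return n_bytes / 2 ** 30

# Grid used in the example above: 1000 x 250 x 250 doubles per array.
print(grid_memory_gib(1000, 250, 250))
# The 1000 x 1000 x 1000 case: 8e9 bytes per array.
print(grid_memory_gib(1000, 1000, 1000))
```

If the estimate is too large, one option is to loop over chunks of q and sum each chunk separately, so that only a slice of the 3D grid is held in memory at once.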