Computing definite integrals in Python

I'm trying to write a loop that calculates the value of a definite integral at each step. The function bigF is very complicated. In simple terms, it integrates a bunch of terms with respect to s, from s = tn-(n/2) to s = tn+(n/2). After the integration, bigF still depends on t, so you can say bigF(t) = integral(f(s,t)), where f(s,t) is the big mess of terms passed to integrate.integ. In the last line, I want to evaluate bigF(t) at t = tn after bigF computes the integral of f(s,t).
After running, I get the error global name 's' is not defined. But s was meant to be just a dummy variable in the integration, since I am computing a convolution. What do I need to do?
import numpy as np
import scipy.integrate as integ
import math

nt = 5001      # since (50 - 0)/.01 = 5000
dt = .01       # = H
H = .01
theta_n = np.ones(nt)
theta_n[1] = 0       # theta_o
omega_n = np.ones(nt)
omega_n[1] = -0.4    # omega_o
epsilon = 10**(-6)
eta = epsilon*10
t_o = 0

def bigF(t, n):
    return integrate.integ((422.11/eta)*math.exp((5*(4*((eta*t-s-tn)^2)/eta^2)-1)^(-1))*omega, s, tn-(n/2), tn+(n/2))

for n in range(1, 4999):
    tn = t_o + n*dt
    theta_n[n+1] = theta_n[n] + H*bigF(tn, n)

If you're doing a convolution, sounds like you want numpy.convolve.
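For the NameError itself, here is a minimal sketch of the usual pattern with scipy.integrate.quad: the dummy variable s is simply the argument of the Python function you pass in, so it never needs to exist as a global name. The integrand and constants below are simplified placeholders, not the asker's exact expression.

import numpy as np
from scipy import integrate

def bigF(tn, n, omega, eta=0.1):
    # s exists only as the parameter of this integrand function
    integrand = lambda s: (1.0/eta)*np.exp(-((s - tn)/eta)**2)*omega
    val, _err = integrate.quad(integrand, tn - n/2.0, tn + n/2.0)
    return val

print(bigF(tn=0.5, n=2, omega=-0.4))  # roughly -0.4*sqrt(pi) for this placeholder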

Related

is there a way to retrieve the nodes automatically computed by mpmath.quad integration routine?

I am trying to calculate an integral with mpmath.quad. I basically have to calculate three moments of a distribution, call it f (pseudocode):
integrate(f(x)/x, 0, infinity)
integrate(f(x)*x, 0, infinity)
integrate(f(x)*ln(x)/x, 0, infinity)
As far as I understood, the tanh-sinh algorithm from mpmath applies a coordinate transformation to the integration interval, uses it to find suitable nodes xk-s and weights wk-s and returns (pseudocode):
sum(f(xk)*wk, k = 1..N)
with N the number of nodes. Since in my original problem the calculation takes a long time, I would like to reuse the values of f computed during the first integration for the other integrals, i.e., compute them from discrete samples with something like scipy.integrate.trapezoid or simpson. Using a Simulator class I adapted from an answer on this forum, I managed to cache the xk's and f(xk)'s, but not the weights. I checked that the xk's from different integrations are the same.
Now, if I naively apply a standard quadrature rule from scipy to my samples, say trapezoid(fs, xs), the result I get is different from that calculated by mpmath.quad, especially if the samples are few. While this fact does not surprise me, I would like to find out how to retrieve the weights mpmath.quad uses in the first calculation, say the one for f(x)/x, so I can avoid running the time-consuming algorithm thrice.
I cannot make sense of the documentation on this point. The mpmath documentation gives a lot of examples, but none concerning the retrieval of the calculated nodes, although it states several times that the nodes are "cached". Where are they? mpmath.quad only returns the integral result!
So I would like to know: is what I'm trying to achieve sensible at all? And if it is, how could I accomplish the task?
Below is code that reproduces the behaviour. Any help is much appreciated.
import numpy as np
import mpmath as mp
import matplotlib.pyplot as plt
from scipy.integrate import trapezoid, simpson

class Simulator:
    def __init__(self, func, storex: np.ndarray = np.array([]), storef: np.ndarray = np.array([])):
        self.func = func
        self.storex = storex
        self.storef = storef

    def simulate(self, x, *args):
        result = self.func(x, *args)
        self.storex = np.append(self.storex, x)
        self.storef = np.append(self.storef, result)
        return result

def lorentz(x):
    x = mp.mpf(x)
    return mp.mpf(1)/(mp.power(x - mp.mpf(1), 2) + mp.mpf(1))

simratio = simproduct = Simulator(lorentz)

integralratio = mp.quad(lambda x: simratio.simulate(x)/x, [0, 1, mp.inf])
integralproduct = mp.quad(lambda x: simproduct.simulate(x)*x, [0, 1, mp.inf])

# we get the sampled xs and f(x)s
ratiox, ratioy = np.transpose(sorted([(x, y) for x, y in zip(simratio.storex, simratio.storef)]))
integralproduct_trap = trapezoid(ratioy*ratiox, ratiox)

print("int[f(x)/x], quad: ", integralratio)
print("int[f(x)*x], quad: ", integralproduct)
print("int[f(x)*x], scipy.trapezoid: ", integralproduct_trap)

How to avoid multiple calls to a slow function when using scipy integration with complex valued functions? [duplicate]

I'm currently using scipy.integrate.quad to successfully integrate some real integrands. Now a situation has appeared where I need to integrate a complex integrand. quad doesn't seem able to do it, nor do the other scipy.integrate routines, so I ask: is there any way to integrate a complex integrand using scipy.integrate, without having to separate the integral into its real and imaginary parts?
What's wrong with just separating it out into real and imaginary parts? scipy.integrate.quad requires the integrated function to return floats (i.e. real numbers) for the algorithm it uses.
import numpy as np
from scipy.integrate import quad

def complex_quadrature(func, a, b, **kwargs):
    def real_func(x):
        return np.real(func(x))
    def imag_func(x):
        return np.imag(func(x))
    real_integral = quad(real_func, a, b, **kwargs)
    imag_integral = quad(imag_func, a, b, **kwargs)
    return (real_integral[0] + 1j*imag_integral[0], real_integral[1:], imag_integral[1:])
E.g.,
>>> complex_quadrature(lambda x: np.exp(1j*x), 0, np.pi/2)
((0.99999999999999989+0.99999999999999989j),
(1.1102230246251564e-14,),
(1.1102230246251564e-14,))
which is what you expect up to rounding error: the integral of exp(i x) from 0 to pi/2 is (1/i)(e^(i pi/2) - e^0) = -i(i - 1) = 1 + i ~ (0.99999999999999989+0.99999999999999989j).
And for the record in case it isn't 100% clear to everyone, integration is a linear functional, meaning that ∫ { f(x) + k g(x) } dx = ∫ f(x) dx + k ∫ g(x) dx (where k is a constant with respect to x). Or for our specific case ∫ z(x) dx = ∫ Re z(x) dx + i ∫ Im z(x) dx as z(x) = Re z(x) + i Im z(x).
If you are trying to do a integration over a path in the complex plane (other than along the real axis) or region in the complex plane, you'll need a more sophisticated algorithm.
Note: scipy.integrate will not directly handle complex integration. Why? It does the heavy lifting in the FORTRAN QUADPACK library, specifically in qagse.f, which explicitly requires the functions/variables to be real before doing its "global adaptive quadrature based on 21-point Gauss–Kronrod quadrature within each subinterval, with acceleration by Peter Wynn's epsilon algorithm." So unless you want to try to modify the underlying FORTRAN to handle complex numbers and compile it into a new library, you aren't going to get it to work.
If you really want to do the Gauss-Kronrod method with complex numbers in exactly one integration, look at Wikipedia's page and implement it directly as done below (using the 15-point/7-point rule). Note, I memoized the function to skip repeated calls at shared evaluation points (the 7 Gauss nodes are a subset of the 15 Kronrod nodes), assuming function calls are slow, as if the function is very complex. I also only did the 7-point and 15-point rules, since I didn't feel like calculating the nodes/weights myself and those were the ones listed on Wikipedia, but I'm getting reasonable errors for test cases (~1e-14).
import numpy as np

def quad_routine(func, a, b, x_list, w_list):
    c_1 = (b - a)/2.0
    c_2 = (b + a)/2.0
    eval_points = [c_1*x + c_2 for x in x_list]
    func_evals = [func(x) for x in eval_points]
    return c_1 * sum(np.array(func_evals) * np.array(w_list))

def quad_gauss_7(func, a, b):
    x_gauss = [-0.949107912342759, -0.741531185599394, -0.405845151377397, 0, 0.405845151377397, 0.741531185599394, 0.949107912342759]
    w_gauss = [0.129484966168870, 0.279705391489277, 0.381830050505119, 0.417959183673469, 0.381830050505119, 0.279705391489277, 0.129484966168870]
    return quad_routine(func, a, b, x_gauss, w_gauss)

def quad_kronrod_15(func, a, b):
    x_kr = [-0.991455371120813, -0.949107912342759, -0.864864423359769, -0.741531185599394, -0.586087235467691, -0.405845151377397, -0.207784955007898, 0.0, 0.207784955007898, 0.405845151377397, 0.586087235467691, 0.741531185599394, 0.864864423359769, 0.949107912342759, 0.991455371120813]
    w_kr = [0.022935322010529, 0.063092092629979, 0.104790010322250, 0.140653259715525, 0.169004726639267, 0.190350578064785, 0.204432940075298, 0.209482141084728, 0.204432940075298, 0.190350578064785, 0.169004726639267, 0.140653259715525, 0.104790010322250, 0.063092092629979, 0.022935322010529]
    return quad_routine(func, a, b, x_kr, w_kr)

class Memoize(object):
    def __init__(self, func):
        self.func = func
        self.eval_points = {}
    def __call__(self, *args):
        if args not in self.eval_points:
            self.eval_points[args] = self.func(*args)
        return self.eval_points[args]

def quad(func, a, b):
    '''Output is the 15-point estimate and the estimated error.'''
    func = Memoize(func)  # Memoize function to skip repeated function calls.
    g7 = quad_gauss_7(func, a, b)
    k15 = quad_kronrod_15(func, a, b)
    # I don't have much faith in this error estimate taken from Wikipedia
    # without incorporating how it should scale with changing limits
    return [k15, (200*np.abs(g7 - k15))**1.5]
Test case:
>>> quad(lambda x: np.exp(1j*x), 0, np.pi/2.0)
[(0.99999999999999711+0.99999999999999689j), 9.6120083407040365e-19]
I don't trust the error estimate -- I took the recommended error estimate from the wiki for integration on [-1, 1], and the values don't seem reasonable to me. E.g., the error above compared with the truth is ~5e-15, not ~1e-19. I'm sure that if someone consulted Numerical Recipes, you could get a more accurate estimate. (You probably have to multiply by (b-a)/2 to some power, or something similar.)
Recall, this Python version is less accurate than just calling scipy's QUADPACK-based integration twice. (You could improve upon it if desired.)
I realize I'm late to the party, but perhaps quadpy (a project of mine) can help. This
import quadpy
import numpy
val, err = quadpy.quad(lambda x: numpy.exp(1j * x), 0, 1)
print(val)
correctly gives
(0.8414709848078964+0.4596976941318605j)
which agrees with the analytic value of the integral of e^(ix) from 0 to 1, namely sin(1) + i(1 - cos(1)).

scipy.optimize gets trapped in a local minimum. What can I do?

from numpy import *; from scipy.optimize import *; from math import *

def f(X):
    x = X[0]; y = X[1]
    return x**4 - 3.5*x**3 - 2*x**2 + 12*x + y**2 - 2*y

bnds = ((1, 5), (0, 2))
min_test = minimize(f, [1, 0.1], bounds=bnds)
print(min_test.x)
My function f(X) has a local minimum at x = 2.557, y = 1, which I should be able to find.
The code shown above only ever gives a result with x = 1. I have tried different tolerances and all three methods: L-BFGS-B, TNC and SLSQP.
This is the thread I have been looking at so far:
Scipy.optimize: how to restrict argument values
How can I fix this?
I am using Spyder(Python 3.6).
You have just encountered the problem with local optimization: it strongly depends on the start (initial) values you pass in. If you supply [2, 1], it will find the correct minimum.
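A quick check of that claim, as a sketch that reuses f and bnds from the question:

# With a start inside the basin of the interior minimum, the bounded local
# optimizer converges to roughly (2.557, 1.0) instead of the x = 1 boundary.
min_test = minimize(f, [2, 1], bounds=bnds)
print(min_test.x)   # approximately [2.557, 1.0]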
Common solutions are:
use your optimization in a loop with random starting points inside your boundaries
import numpy as np
from numpy import *; from scipy.optimize import *; from math import *

def f(X):
    x = X[0]; y = X[1]
    return x**4 - 3.5*x**3 - 2*x**2 + 12*x + y**2 - 2*y

bnds = ((1, 3), (0, 2))

for i in range(100):
    x_init = np.random.uniform(low=bnds[0][0], high=bnds[0][1])
    y_init = np.random.uniform(low=bnds[1][0], high=bnds[1][1])
    min_test = minimize(f, [x_init, y_init], bounds=bnds)
    print(min_test.x, min_test.fun)
use an algorithm that can break free of local minima; I can recommend scipy's basinhopping()
use a global optimization algorithm and use its result as the initial value for a local algorithm. Recommendations are NLopt's DIRECT or the MADS algorithms (e.g. NOMAD). There is also another one in scipy, shgo, that I have not tried yet (see the sketch right after this list).
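A minimal, hedged sketch of the shgo option just mentioned, reusing f and bnds from the question; shgo returns the best minimum it found plus the list of local minimizers it located:

from scipy.optimize import shgo

res = shgo(f, bounds=bnds)
print(res.x, res.fun)   # best minimum found within the bounds
print(res.xl)           # all local minimizers shgo located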
Try scipy.optimize.basinhopping. It simply repeats your minimize procedure multiple times and collects multiple local minima; the smallest one is taken as the global minimum.
minimizer_kwargs = {"method": "L-BFGS-B"}
# using f and the starting guess from the question
res = basinhopping(f, [1, 0.1], niter=100, minimizer_kwargs=minimizer_kwargs)
print(res.x, res.fun)

Speeding up Evaluation of Sympy Symbolic Expressions

A Python program I am currently working on (Gaussian process classification) is bottlenecking on evaluation of Sympy symbolic matrices, and I can't figure out what I can, if anything, do to speed it up. Other parts of the program I've already ensured are typed properly (in terms of numpy arrays) so calculations between them are properly vectorised, etc.
I looked into Sympy's codegen functions a bit (autowrap and binary_function in particular), but because within my ImmutableMatrix object itself there are partial derivatives over elements of a symbolic matrix, there is a long list of 'unhashable' things which prevents me from using the codegen functionality.
Another possibility I looked into was using Theano - but after some initial benchmarks, I found that while it built the initial partial-derivative symbolic matrices much quicker, it seemed to be a few orders of magnitude slower at evaluation, the opposite of what I was seeking.
Below is a working, extracted snippet of the code I am currently working on.
import theano
import sympy
from sympy.utilities.autowrap import autowrap
from sympy.utilities.autowrap import binary_function
import numpy as np
import math
from datetime import datetime

# 'Vectorized' cdist that can handle symbols/arbitrary types - preliminary benchmarking
# put it at ~15 times faster than a python list comprehension, but still notably slower
# (I forget by how much) than cdist, of course
def sqeucl_dist(x, xs):
    m = np.sum(np.power(
        np.repeat(x[:,None,:], len(xs), axis=1) -
        np.resize(xs, (len(x), xs.shape[0], xs.shape[1])),
        2), axis=2)
    return m

def build_symbolic_derivatives(X):
    # Pre-calculate derivatives of inverted matrix to substitute values in the Squared Exponential NLL gradient
    f_err_sym, n_err_sym = sympy.symbols("f_err, n_err")

    # (1,n) shape 'matrix' (vector) of length scales for each dimension
    l_scale_sym = sympy.MatrixSymbol('l', 1, X.shape[1])

    # K matrix
    print("Building sympy matrix...")
    eucl_dist_m = sqeucl_dist(X/l_scale_sym, X/l_scale_sym)
    m = sympy.Matrix(f_err_sym**2 * math.e**(-0.5 * eucl_dist_m)
                     + n_err_sym**2 * np.identity(len(X)))

    # Element-wise derivative of K matrix over each of the hyperparameters
    print("Getting partial derivatives over all hyperparameters...")
    pd_t1 = datetime.now()
    dK_df = m.diff(f_err_sym)
    dK_dls = [m.diff(l_scale_sym) for l_scale_sym in l_scale_sym]
    dK_dn = m.diff(n_err_sym)
    print("Took: {}".format(datetime.now() - pd_t1))

    # Lambdify each of the dK/dts to speed up substitutions per optimization iteration
    print("Lambdifying ")
    l_t1 = datetime.now()
    dK_dthetas = [dK_df] + dK_dls + [dK_dn]
    dK_dthetas = sympy.lambdify((f_err_sym, l_scale_sym, n_err_sym), dK_dthetas, 'numpy')
    print("Took: {}".format(datetime.now() - l_t1))
    return dK_dthetas

# Evaluates each dK_dtheta pre-calculated symbolic lambda with current iteration's hyperparameters
def eval_dK_dthetas(dK_dthetas_raw, f_err, l_scales, n_err):
    l_scales = sympy.Matrix(l_scales.reshape(1, len(l_scales)))
    return np.array(dK_dthetas_raw(f_err, l_scales, n_err), dtype=np.float64)

dimensions = 3
X = np.random.rand(50, dimensions)
dK_dthetas_raw = build_symbolic_derivatives(X)

f_err = np.random.rand()
l_scales = np.random.rand(3)
n_err = np.random.rand()

t1 = datetime.now()
dK_dthetas = eval_dK_dthetas(dK_dthetas_raw, f_err, l_scales, n_err)  # ~99.7% of the runtime
print(datetime.now() - t1)
In this example, five 50x50 symbolic matrices are evaluated, i.e. only 12,500 elements, and it takes 7 seconds. I've spent quite some time looking for resources on speeding up operations like this, and on translating the code into Theano (at least until I found its evaluation slower in my case), with no luck there either.
Any help greatly appreciated!

How to minimize a multivariable function using scipy

So I have the function
f(x) = I_0*(exp(Q*x/(n*K*T)) - 1)
where Q, K and T are constants; for the sake of clarity I'll add the values
Q = 1.6x10^(-19)
K = 1.38x10^(-23)
T = 77.6
and n and I_0 are the two parameters that I'm trying to fit.
My xdata is a list of 50 datapoints, and so is my ydata. So far this is my code:
from __future__ import division
import scipy.optimize as optimize
import numpy

xdata = numpy.array([1.07,1.07994,1.08752,1.09355,
                     1.09929,1.10536,1.10819,1.11321,
                     1.11692,1.12099,1.12435,1.12814,
                     1.13181,1.13594,1.1382,1.14147,
                     1.14443,1.14752,1.15023,1.15231,
                     1.15514,1.15763,1.15985,1.16291,1.16482])

ydata = [0.00205,
         0.004136,0.006252,0.008252,0.010401,
         0.012907,0.014162,0.016498,0.018328,
         0.020426,0.022234,0.024363,0.026509,
         0.029024,0.030457,0.032593,0.034576,
         0.036725,0.038703,0.040223,0.042352,
         0.044289,0.046043,0.048549,0.050146]

# xdata and ydata are experimental data: xdata is voltage and ydata is current
def f(x, I0, N):
    # I0 = 7.85E-07
    # N = 3.185413895
    Q = 1.66E-19
    K = 1.38065E-23
    T = 77.3692
    return I0*(numpy.e**((Q*x)/(N*K*T)) - 1)

result = optimize.curve_fit(f, xdata, ydata)  # trying to fit I0 and N
But the answer doesn't give suitably optimized parameters.
Any help would be hugely appreciated; I realize there may be something obvious I am missing, I just can't see what it is!
I have tried this, and for some reason, if you throw out those constants so the function becomes
def f(x, I0, N):
    return I0*(numpy.exp(x/N) - 1)
you get something reasonable.
1.86901114e-13, 4.41838309e-02
It's true that it works better when we get rid of the constants. Define the function as:
def f(x, A, B):
    return A*(np.e**(B*x) - 1)
and fit it with curve_fit; you'll get A, which is exactly I0 (A = I0), and B, from which you can obtain N simply as N = Q/(B*K*T). I managed to get a pretty good fit.
I think that with too many constants the algorithm gets confused somehow.
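A minimal sketch of that recipe, reusing xdata and ydata from the question and seeding the fit with rough values taken from the answer above (A ~ I0 ~ 1.9e-13, B ~ 1/N ~ 22.6):

import numpy as np
from scipy.optimize import curve_fit

def g(x, A, B):
    return A*(np.exp(B*x) - 1)

# fit the rescaled two-parameter model, then map back to the physical parameters
popt, pcov = curve_fit(g, xdata, ydata, p0=(1.9e-13, 22.6))
A, B = popt

Q, K, T = 1.6e-19, 1.38065e-23, 77.3692
I0 = A
N = Q/(B*K*T)
print(I0, N)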
