How to use Zero-skewness log transform in Python - python

How to do zero-skewness log transform in Python?
For example in Stata it is implemented in lnskew0 (see https://www.stata.com/manuals13/rlnskew0.pdf).
I didn't find an implementation in Python. Is anyone aware of an implementation?
Otherwise, a first try would be:
from scipy.stats import skew
import numpy as np
from scipy.optimize import root_scalar
def lnskew0(x):
def skew_ln(k):
return skew(np.log(x - k))
res = root_scalar(
skew_ln,
bracket=[-x.min(), x.max()*0.99999],
method='bisect'
)
return np.log(x - res.root)
Works fine on numpy arrays with only positive numbers. How is Stata's lnskew0 implemented that it works with negative numbers as well?

I gave it another try so that it works with negative numbers as well:
from scipy.stats import skew
import numpy as np
from scipy.optimize import root_scalar
def lnskew0(x):
x0 = x + 1
def skew_ln_pos(k):
with np.errstate(all='ignore'):
return skew(np.log(x0 - k))
res_pos = root_scalar(
skew_ln_pos,
bracket=[-150, 150],
method='bisect'
)
def skew_ln_neg(k):
with np.errstate(all='ignore'):
return skew(np.log(-x0 - k))
res_neg = root_scalar(
skew_ln_neg,
bracket=[-150, 150],
method='bisect'
)
res = (res_pos.root - 1, res_neg.root + 1)
lnskew0_res = (
np.log(x - res[0]),
np.log(-x - res[1])
)
whichmin = np.nanargmin([abs(skew(x)) for x in lnskew0_res])
return lnskew0_res[whichmin]
Note: It still has one issue. The bracket of the root_scalar needs to be choosen manually.

Related

How to transform a np.array into a function?

I have the following np.array which describes a rectangular wave. I would like to transform it into a callable function with a continuous argument. The np.array is:
import matplotlib.pyplot as plt
import numpy as np
from numpy.random import seed
from numpy.random import rand
def piecewise_control( off_times,amp_inj, period_inj ):
def select(T):
return lambda t: (-T/2 <= t) & (t < T/2)
def pulse_train(t, at, shape):
return np.sum(shape(t - at[:,np.newaxis]), axis=0)
for i in range(1,len(off_times)):
off_times[i] += off_times[i-1] + period_inj
return amp_inj*pulse_train(t,off_times,shape=select(period_inj))
t=np.linspace(0,100,10000)
off_times = 10*rand(10)
period_inj = 1
amp_inj = 1
control = piecewise_control( off_times,amp_inj, period_inj )
plt.plot(t,control)
plt.show()
This answer inspired me.
The plot is the following:
The question is: can we transform the array control into a function with a continuous argument?
Of course if we did:
def ccontrol(t, control):
return control[t]
unfortunately we would get a function which only depends on integers.
You can subclass numpy.ndarray and implement the __call__ method:
import numpy as np
class MyArray(np.ndarray):
def __call__(self, idx):
return self[idx]
control = np.random.rand(100)
control_view = control.view(MyArray)
print(control_view(5), control[5])
For interpolation you can use scipy.interpolate. In fact, interpolation routines can return functions that you can call with any input, not necessarily integers.

Multiprocessing and scipy (dblquad)

I am trying to speed up the following code in python:
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
from scipy import integrate
import camb
from tqdm import tqdm
import os
#Reading a PS
dir = os.getcwd()
data = np.loadtxt(dir+"/ps1-peacock.txt")
kh = data[:,0]
p_lin = data[:,1]
p_nlin = data[:,2]
p_linear = interpolate.interp1d(kh,p_lin)
#Integrand of P22
def upper_mu(x):
return min(1.0,(kk**2 + np.exp(2*x))/(2*kk*np.exp(x)))
def lower_mu(x):
return max(-1.0,-(kk**2+np.exp(x))/(2*kk*np.exp(x)))
def mulow(x):
return max(-1.0,(kh[-1]**2.0-kk**2.0-np.exp(x)**2.0)/(-2.0*kk*np.exp(x)))
def muhigh(x):
return min(1.0,(kh[0]**2.0-kk**2.0-np.exp(x)**2.0)/(-2.0*kk*np.exp(x)))
def f22(mu,q,k):
r = np.exp(q)/k
F = (7.0*mu+(3.0-10.0*mu**2)*r)/(14.0*r*(r**2-2.0*mu*r+1.0))
psik = (k**2+np.exp(2*q)-2.0*k*mu*np.exp(q))**0.5
if (psik>kh[0] and psik<kh[-1]):
return 1.0/2.0/np.pi**2.0*np.exp(3*q)*p_linear(np.exp(q))*p_linear(psik)*F**2
else:
return 0
P22 = np.zeros_like(kh)
error = np.zeros_like(kh)
for i in tqdm(range(0,np.shape(kh)[0])):
kk = kh[i]
P22[i], error[i] = integrate.dblquad(f22,np.log(kh[0]),np.log(kh[-1]),mulow,muhigh,args=(kh[i],),epsrel=1e-3, epsabs=50)[:2]
Here follows the integral in text for reasons of clarity:
I would like to use multiprocessing to improve the performance of dblquad(). Does anyone know how can I implement it in this specific case?
Multiprocessing won't help here, you cannot split the dblquad work between python processes.
If you have several integrals to compute, then yes, you can split integrals between processes. Whether this is worth it strongly depends on the amount of work there is for each process.

Is there a multiple integrator in Python providing both variable integration limits (like scipy) and high precision (like mpmath)?

I can use scipy quad and nquad for a quadruple integration involving variable integration limits. The problem is that the default precision used raises an Error when the tolerance requested cannot be achieved. With mpmath integrator, I can define any arbitrary precision with setting mp.dps = arbitrary, but I can't see if and how the limits can become variable like with nquad. Mpmath also provides a very fast execution with Gauss-Legendre method in quadgl, which is highly desirable, because my function is smooth, but takes an exorbitant amount of time with scipy to complete four integrations. Please help.
The below is only a simple function that fails my goal:
from datetime import datetime
import scipy
from scipy.special import jn, jn_zeros
import numpy as np
import matplotlib.pyplot as plt
from mpmath import *
from mpmath import mp
from numpy import *
from scipy.optimize import *
# Set the precision
mp.dps = 15#; mp.pretty = True
# Setup shortcuts, so we can just write exp() instead of mp.exp(), etc.
F = mp.mpf
exp = mp.exp
sin = mp.sin
cos = mp.cos
asin = mp.asin
acos = mp.acos
sqrt = mp.sqrt
pi = mp.pi
tan = mp.tan
start = datetime.now()
print(start)
#optionsy={'limit':100, 'epsabs':1.49e-1, 'epsrel':1.49e-01}
#optionsx={'limit':100, 'epsabs':1.49e-1, 'epsrel':1.49e-01}
def f(x,y,z):
return 2*sqrt(1-x**2) + y**2.0 + z
def rangex(y,z):
return [-1,1]
def rangey(z):
return [1,2]
def rangez():
return [2,3]
def result():
return quadgl(f, rangex, rangey, rangez)
"""
#The below works:
def result():
return quadgl(f, [-1,1], [1,2], [2,3])
"""
print(result())
end = datetime.now()
print(end-start)
Ok, let me put something in answer, hard to put code in the comments
Simple optimization with MP math is to follow simple rules:
y2.0 is VERY expensive (log, exp, ...), replace with y*y
y2 is still expensive, replace with y*y
multiplication is a lot more expensive than summation, replace x*y + y**2.0 with (x+y)*y
Division is more expensive than multiplication, replace y/4 with 0.25*y
Code, Win 10 x64, Python 3.8
def f3():
def f2(x):
def f1(x,y):
def f(x,y,z):
return 1.0 + (x+y)*y + 3.0*z
return mpmath.quadgl(f, [-1.0, 1], [1.2*x, 1.0], [0.25*y, x*x])
return mpmath.quadgl(f1, [-1, 1.0], [1.2*x, 1.0])
return mpmath.quadgl(f2, [-1.0, 1.0])
on my computer went from 12.9 sec to 10.6 sec, about 20% off
Below is a simple example of how I can do only triple integration with mpmath. This does not address high precision with four integrations. In any case, execution time is even a bigger problem. Any help welcome.
from datetime import datetime
import scipy
import numpy as np
from mpmath import *
from mpmath import mp
from numpy import *
# Set the precision
mp.dps = 20#; mp.pretty = True
# Setup shortcuts, so we can just write exp() instead of mp.exp(), etc.
F = mp.mpf
exp = mp.exp
sin = mp.sin
cos = mp.cos
asin = mp.asin
acos = mp.acos
sqrt = mp.sqrt
pi = mp.pi
tan = mp.tan
start = datetime.now()
print('start: ',start)
def f3():
def f2(x):
def f1(x,y):
def f(x,y,z):
return 1.0 + x*y + y**2.0 + 3.0*z
return quadgl(f, [-1.0, 1], [1.2*x, 1.0], [y/4, x**2.0])
return quadgl(f1, [-1, 1.0], [1.2*x, 1.0])
return quadgl(f2, [-1.0, 1.0])
print('result =', f3())
end = datetime.now()
print('duration in mins:',end-start)
#start: 2020-08-19 17:05:06.984375
#result = 5.0122222222222221749
#duration: 0:01:35.275956
Furthermore, an attempt to combine one (first) scipy integration followed by a triple mpmath integrator does not seem to produce any output for more than 24 hours even with a simplest function. What is wrong with the following code?
from datetime import datetime
import scipy
import numpy as np
from mpmath import *
from mpmath import mp
from numpy import *
from scipy import integrate
# Set the precision
mp.dps = 15#; mp.pretty = True
# Setup shortcuts, so we can just write exp() instead of mp.exp(), etc.
F = mp.mpf
exp = mp.exp
sin = mp.sin
cos = mp.cos
asin = mp.asin
acos = mp.acos
sqrt = mp.sqrt
pi = mp.pi
tan = mp.tan
start = datetime.now()
print('start: ',start)
#Function to be integrated
def f(x,y,z,w):
return 1.0 + x + y + z + w
#Scipy integration:FIRST INTEGRAL
def f0(x,y,z):
return integrate.quad(f, -20, 10, args=(x,y,z), epsabs=1.49e-12, epsrel=1.4e-8)[0]
#Mpmath integrator of function f0(x,y,z): THREE OUTER INTEGRALS
def f3():
def f2(x):
def f1(x,y):
return quadgl(f0, [-1.0, 1], [-2, x], [-10, y])
return quadgl(f1, [-1, 1.0], [-2, x])
return quadgl(f2, [-1.0, 1.0])
print('result =', f3())
end = datetime.now()
print('duration:', end-start)
Below is the full code, for which the original question was raised. It contains the use of scipy to carry out four integrations:
# Imports
from datetime import datetime
import scipy.integrate as si
import scipy
from scipy.special import jn, jn_zeros
from scipy.integrate import quad
from scipy.integrate import nquad
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import fixed_quad
from scipy.integrate import quadrature
from mpmath import mp
from numpy import *
from scipy.optimize import *
# Set the precision
mp.dps = 30
# Setup shortcuts, so we can just write exp() instead of mp.exp(), etc.
F = mp.mpf
exp = mp.exp
sin = mp.sin
cos = mp.cos
asin = mp.asin
acos = mp.acos
sqrt = mp.sqrt
pi = mp.pi
tan = mp.tan
start = datetime.now()
print(start)
R1 = F(6.37100000000000e6)
k1 = F(8.56677817058932e-8)
R2 = F(1.0)
k2 = F(5.45789437248245e-01)
r = F(12742000.0)
#Replace computed initial constants with values presuming is is faster, like below:
#a2 = R2/r
#print(a2)
a2 = F(0.0000000784806152880238581070475592529)
def u1(phi2):
return r*cos(phi2)-r*sqrt(a2**2.0-(sin(phi2))**2.0)
def u2(phi2):
return r*cos(phi2)+r*sqrt(a2**2.0-(sin(phi2))**2.0)
def om(u,phi2):
return u-r*cos(phi2)
def mp2(phi2):
return r*sin(phi2)
def a1(u):
return R1/u
optionsx={'limit':100, 'epsabs':1.49e-14, 'epsrel':1.49e-11}
optionsy={'limit':100, 'epsabs':1.49e-14, 'epsrel':1.49e-10}
#---- in direction u
def a1b1_u(x,y,u):
return 2.0*u*sqrt(a1(u)**2.0-(sin(y))**2.0)
def oa2_u(x,y,u,phi2):
return (mp2(phi2)*sin(y)*cos(x)+om(u,phi2)*cos(y)
- sqrt((mp2(phi2)*sin(y)*cos(x)+om(u,phi2)*(cos(y)))**2.0
+ R2**2.0-om(u,phi2)**2.0-mp2(phi2)**2.0))
def ob2_u(x,y,u,phi2):
return (mp2(phi2)*sin(y)*cos(x)+om(u,phi2)*cos(y)
+ sqrt((mp2(phi2)*sin(y)*cos(x)+om(u,phi2)*(cos(y)))**2.0
+ R2**2.0-om(u,phi2)**2.0-mp2(phi2)**2.0))
def func1_u(x,y,u,phi2):
return (-exp(-k1*a1b1_u(x,y,u)-k2*ob2_u(x,y,u,phi2))+exp(+k2*oa2_u(x,y,u,phi2)))*sin(y)*cos(y)
#--------joint_coaxial integration: u1
def fg_u1(u,phi2):
return nquad(func1_u, [[-pi, pi], [0, asin(a1(u))]], args=(u,phi2), opts=[optionsx,optionsy])[0]
#Constants to be used for normalization at the end or in the interim inegrals if this helps adjust values for speed of execution
piA1 = pi*(R1**2.0-1.0/(2.0*k1**2.0)+exp(-2.0*k1*R1)*(2.0*k1*R1+1.0)/(2.0*k1**2.0))
piA2 = pi*(R2**2.0-1.0/(2.0*k2**2.0)+exp(-2.0*k2*R2)*(2.0*k2*R2+1.0)/(2.0*k2**2.0))
#----THIRD integral of u1
def third_u1(u,phi2):
return fg_u1(u,phi2)*u**2.0
def third_u1_I(phi2):
return quad(third_u1, u1(phi2), u2(phi2), args = (phi2), epsabs=1.49e-20, epsrel=1.49e-09)[0]
#----FOURTH integral of u1
def fourth_u1(phi2):
return third_u1_I(phi2)*sin(phi2)*cos(phi2)
def force_u1():
return quad(fourth_u1, 0.0, asin(a2), args = (), epsabs=1.49e-20, epsrel=1.49e-08)[0]
force_u1 = force_u1()*r**2.0*2.0*pi*k2/piA1/piA2
print('r = ', r, 'force_u1 =', force_u1)
end = datetime.now()
print(end)
args = {
'p':r,
'q':force_u1,
'r':start,
's':end
}
#to txt file
f=open('Sphere-test-force-u-joint.txt', 'a')
f.write('\n{p},{q},{r},{s}'.format(**args))
#f.flush()
f.close()
I am interested in setting the epsrel sufficiently low, depending on the case. The epsabs is generally unknown apriori, so I understand that I should make it very low to avoid it taking hold of the output, in which case it introduces an computational articact. When I make it lower, an Error warning is raised that the round-off errors are significant and the total error may be underestimated for the desired tolerance to be achieved.
Whilst the question is not about speed, the latter is intimately connected with making practical the execution of a quadruple integration prior to the inquiry about precision and tolerance. To test the speed, I set (increased) all four epsrel=1e-02, which reduced the time of the original code down to 2:14 (hours). Then I simplified powers per Severin and implemented some memoization. These reduced the time cumulatively down to 1:29 (hours). The edited lines of the code are provided here:
from memoization import cached
#cached(ttl=10)
def u1(phi2):
return r*cos(phi2)-r*sqrt(a2*a2-sin(phi2)*sin(phi2))
#cached(ttl=10)
def u2(phi2):
return r*cos(phi2)+r*sqrt(a2*a2-sin(phi2)*sin(phi2))
#cached(ttl=10)
def om(u,phi2):
return u-r*cos(phi2)
#cached(ttl=10)
def mp2(phi2):
return r*sin(phi2)
#cached(ttl=10)
def a1(u):
return R1/u
optionsx={'limit':100, 'epsabs':1.49e-14, 'epsrel':1.49e-02}
optionsy={'limit':100, 'epsabs':1.49e-14, 'epsrel':1.49e-02}
def a1b1_u(x,y,u):
return 2.0*u*sqrt(a1(u)*a1(u)-sin(y)*sin(y))
def oa2_u(x,y,u,phi2):
return (mp2(phi2)*sin(y)*cos(x)+om(u,phi2)*cos(y)
- sqrt((mp2(phi2)*sin(y)*cos(x)+om(u,phi2)*(cos(y)))**2.0
+ 1.0-om(u,phi2)*om(u,phi2)-mp2(phi2)*mp2(phi2)))
def ob2_u(x,y,u,phi2):
return (mp2(phi2)*sin(y)*cos(x)+om(u,phi2)*cos(y)
+ sqrt((mp2(phi2)*sin(y)*cos(x)+om(u,phi2)*(cos(y)))**2.0
+ 1.0-om(u,phi2)*om(u,phi2)-mp2(phi2)*mp2(phi2)))
def third_u1(u,phi2):
return fg_u1(u,phi2)*u*u
def third_u1_I(phi2):
return quad(third_u1, u1(phi2), u2(phi2), args = (phi2), epsabs=1.49e-20, epsrel=1.49e-02)[0]
def force_u1():
return quad(fourth_u1, 0.0, asin(a2), args = (), epsabs=1.49e-20, epsrel=1.49e-02)[0]
However, the output is an artifact caused by the inadequate tolerance introduced. I can progressively set the epsrel to lower values and see if the result converges to a realistic value in realistic time with the available scipy precision. Hope this illustrates the original question much better.

Jacobi Method & Basic Matrix Math using NUMPY

I'm getting an import error for "norm". What am I not doing correct??
I'm open to constructive feedback on improving the code, however I have to keep the parameters as they are!
Thanks!!!
Code is below:
import numpy as np
from numpy import norm, inalg, array, zeros, diag, diagflat, dot, linalg
"""Test Case Data"""
A = np.matrix([[4,-1,-1],[-2,6,1],[-1,1,7]])
b = np.matrix([[3],[9],[-6]])
x = np.matrix([[0],[0],[0]])
"""Main Function"""
def jacobi(A, b, x, Tolerance, Iterations):
V = np.diag(A)
D = np.diag(V)
R = D-A
D_I = D.I
D = np.asmatrix(D)
Counter_1 = 1
tol_gauge = 100
while Counter_1 <= Iterations:
# I considered using the "dot" function in NUMPY but I was wary of mixed results
iterative_approach_form = D_I * ((R*x)+b)
tol_gauge = np.linalg.norm(iterative_approach_form-x)
x = iterative_approach_form
if initial_tol <= Tolerance:
return("The Solution x = {},y={}, z={} ".format(x[0], x[1], x[2]))
return("The Solution was found in %s interation(s)" %(Counter_1))
else:
pass
Counter_1 +=1
return("The Solution was not found in {} iteration(s)".format(Iterations))
You need to specify which numpy module you are importing from. The following works if you want to use a function only by its name:
from numpy import linalg
from numpy.linalg import norm
from numpy import zeros, array, diag, diagflat, dot
Looking at you code however, you don't need the second import line, because in the rest of the code the numpy functions are specified according to the accepted norm. For example, norm is already present in your code as np.linalg.norm.
There are three more issues with your code: 1) initial_tol is not assigned a value; 2) tol_gauge is assigned but not used in the code; 3) the last return statement is not indented properly (perhaps only here) and the same is very likely for the block in your while loop.

solve an integral equation by python

I need to solve an integral equation by python 3.2 in win7.
I want to find an initial guess solution first and then use "fsolve()" to solve it in python.
This is the code:
import numpy as np
from scipy.optimize.minpack import fsolve
from cmath import cos, exp
from scipy.integrate.quadpack import quad
def integrand2(x, b):
return exp(-x)/b
def intergralFunc2(b):
integral,err = quad(integrand2, 0, 10, args=(b)) // **error here**
return 0.01 - integral
import matplotlib.pyplot as plt
def findGuess():
vfunc = np.vectorize(intergralFunc2)
f = np.linspace(-20, 20,10)
plt.plot(f, vfunc(f))
plt.xlabel('guess value')
plt.show()
def solveFunction():
y= fsolve(intergralFunc2, 10)
return y
if __name__ == '__main__':
findGuess()
solution = solveFunction()
print("solution is ", solution)
I got error:
quadpack.error: Supplied function does not return a valid float.
Any help would be appreciated.
Just made the following change and it should work (it worked for me).
remove:
from cmath import exp, cos
include:
from numpy import exp, cos
as explained in the comments, the cmath functions accept only float inputs, not arrays.

Categories

Resources