How to apply a custom function to a variable in PyMC? - python

At one step in the model I'm writing, I have to calculate the error function of a quantity. What I'm trying to do looks like this:
from math import erf
import numpy as np
import pymc as pm
sig = pm.Exponential('sig', beta=0.1, size=10)
x = erf(sig ** 2)
This fails because erf doesn't work on arrays. I tried:
@pm.deterministic
def x(sig=sig):
    return [erf(s) for s in sig]
but with no success. I know it's possible to get the result with:
np_erf = np.vectorize(erf)
x = np_erf((sig ** 2).value)
but this doesn't seem like the correct way, because it doesn't produce a pm.Deterministic but just an np.array. How can I do it instead? (PyMC is version 2.3.)
Edit: The above examples were simplified for clarity; here's what the relevant passages look like in the real code. Ideally, I would like this to work:
mu = pm.LinearCombination('mu', [...], [...])
sig2 = pm.exp(mu) ** 2
f = 1 / (pm.sqrt(np.pi * sig2 / 2.0) * erf(W / sig2))
but it fails with the message TypeError: only length-1 arrays can be converted to Python scalars. Going the np.vectorize route
np_erf = np.vectorize(erf)
f = 1 / (pm.sqrt(np.pi * sig2 / 2.0) * np_erf(W / sig2))
crashes with the same error message. The list comprehension
@pm.deterministic
def f(sig2=sig2):
    return [1 / (pm.sqrt(np.pi * s / 2.0) * erf(W / s)) for s in sig2]
works as such, but leads to an error later in the code at this spot:
@pm.observed(plot=True)
def y(value=df['dist'], sig2=sig2, f=f):
    return (np.log(np.exp(-(value ** 2) / 2.0 / sig2) * f)).sum()
and the error is AttributeError: log.
I've got the calculation of the error function working using a numerical approximation, which should mean that the general setup is correct. It would be just nicer and clearer to use the erf function directly.
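For reference, one standard elementwise approximation of erf (Abramowitz & Stegun 7.1.26, absolute error below roughly 1.5e-7) is sketched below; this is only an illustration, not necessarily the approximation used in the actual code:
import numpy as np

def erf_approx(x):
    # Abramowitz & Stegun 7.1.26 polynomial approximation of erf.
    # Works elementwise on numpy arrays, so it can also be used
    # inside a @pm.deterministic function.
    sign = np.sign(x)
    x = np.abs(x)
    t = 1.0 / (1.0 + 0.3275911 * x)
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
           + t * (-1.453152027 + t * 1.061405429))))
    return sign * (1.0 - poly * np.exp(-x * x))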

I found the solution. I didn't realize that if you create a variable using the pymc.deterministic decorator, the parameters passed to the function are numpy.array objects, not pymc.Distribution objects. This allows you to numpy.vectorize the function and apply it to the variable. So instead of
sig = pm.Exponential('sig', beta=0.1, size=10)
x = erf(sig ** 2)
you need to use
sig = pm.Exponential('sig', beta=0.1, size=10)
np_erf = np.vectorize(erf)

@pm.deterministic
def x(sig=sig):
    return np_erf(sig ** 2)
and it works.
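Applying the same pattern to the real code from the edit would then look roughly like this (a sketch only; it assumes W and mu are defined as above, and since mu arrives inside the decorator as a numpy array, np.exp/np.sqrt are used instead of pm.exp/pm.sqrt):
np_erf = np.vectorize(erf)

@pm.deterministic
def f(mu=mu):
    # mu is a plain numpy array here, so ordinary numpy functions apply
    sig2 = np.exp(mu) ** 2
    return 1 / (np.sqrt(np.pi * sig2 / 2.0) * np_erf(W / sig2))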

Related

Evaluating a function with a well-defined value at x,y=0

I am trying to write a program that uses an array in further calculations. I initialize a grid of equally spaced points with NumPy and assign a value at each point as per the code snippet provided below. The function I am trying to describe with this array gives me a division-by-zero error at x=y, and it generally blows up around it. I know that the real part of said function is bounded by band_D/(2*math.pi) at x=y, and I tried manually assigning this value on the diagonal, but it seems that points around it are still ill-behaved, so I am not getting the right values. Is there a way to remedy this? This is what the function looks like with matplotlib:
import math
import numpy as np

gamma = 5
band_D = 100
Dt = 1e-3
x = np.arange(0, 1/gamma, Dt)
y = np.arange(0, 1/gamma, Dt)
xx, yy = np.meshgrid(x, y)
N = x.shape[0]
di = np.diag_indices(N)
time_fourier = (1j/2*math.pi) * (1 - np.exp(1j*band_D*(xx - yy))) / (xx - yy)
time_fourier[di] = band_D / (2*math.pi)
You have a classic 0 / 0 problem. It's not really NumPy's job to figure out that it should apply L'Hospital's rule and solve this for you... I see, as others have commented, that you had the right idea in trying to set the limit value at the diagonal (where x ≈ y), but by the time you hit that line, the warning has already been emitted (just a warning, BTW, not an exception).
For a quick fix (but a bit of a fudge), in this case, you can try to add a small value to the difference:
xy = xx - yy + 1e-100
num = (1j / 2*np.pi) * (1 - np.exp(1j * band_D * xy))
time_fourier = num / xy
This also reveals that there is something wrong with your limit calculation: time_fourier[0, 0] ≈ 157.0796..., not 15.91549... = band_D / (2*math.pi). As written, the prefactor 1j/2*math.pi parses as (1j/2)*math.pi, so the limit on the diagonal should be band_D*math.pi/2.
For a correct calculation:
def f(xy):
    mask = xy != 0
    limit = band_D * np.pi / 2
    return np.where(mask,
                    np.divide((1j/2 * np.pi) * (1 - np.exp(1j * band_D * xy)), xy, where=mask),
                    limit)

time_fourier = f(xx - yy)
You are dividing by x-y, and that will definitely complain when x = y. The function being well behaved there means that the Taylor series doesn't diverge. But Python doesn't know or care about that; it just calculates one step at a time until it reaches the division by 0.
You had the right idea by defining a different value when x = y (i.e., the mathematically true answer), but your way of applying it doesn't work because the correction comes AFTER the division by 0, so it never takes effect. This, however, should work:
def make_time_fourier(x, y):
    if np.isclose(x, y):
        return band_D / (2*math.pi)
    else:
        return (1j/2*math.pi) * (1 - np.exp(1j*band_D*(x - y))) / (x - y)

time_fourier = np.vectorize(make_time_fourier)(xx, yy)
print(time_fourier)
You can use np.divide with the where option:
import math
import numpy as np

gamma = 5
band_D = 100
Dt = 1e-3
x = np.arange(0, 1/gamma, Dt)
y = np.arange(0, 1/gamma, Dt)
xx, yy = np.meshgrid(x, y)
N = x.shape[0]
di = np.diag_indices(N)
time_fourier = (1j / 2 * np.pi) * (1 - np.exp(1j * band_D * (xx - yy)))
time_fourier = np.divide(time_fourier,
                         (xx - yy),
                         where=(xx - yy) != 0)
time_fourier[di] = band_D / (2 * np.pi)
You can reformulate your function so that the division is inside the (numpy) sinc function, which handles it correctly.
To save typing I'll use D for band_D and use a variable
z = D*(xx-yy)/2
Then
T = (1j/2*pi) * (1 - np.exp(1j*band_D*(xx-yy))) / (xx-yy)
  = (D/2) * (1j/2*pi) * (1 - cos(2*z) - 1j*sin(2*z)) / z
  = (1j*D*pi/4) * (2*sin(z)*sin(z) - 2j*sin(z)*cos(z)) / z
  = (1j*D*pi/2) * sin(z)/z * (sin(z) - 1j*cos(z))
  = (1j*D*pi/2) * sinc(z/pi) * (sin(z) - 1j*cos(z))
numpy defines sinc(x) to be sin(pi*x)/(pi*x).
I can't run Python right now, so you should check my calculations.
The steps are
Substitute the definition of z and expand the complex exp
Apply the double angle formulae for sin and cos
Factor out sin(z)
Substitute the definition of sinc
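A quick NumPy sketch of that reformulation (untested here, so compare it against the plain formula away from the diagonal before relying on the constant factor):
import numpy as np

def time_fourier_sinc(xx, yy, band_D):
    # z = D*(x - y)/2; np.sinc(z/pi) equals sin(z)/z and handles z = 0 cleanly.
    z = band_D * (xx - yy) / 2.0
    return (1j * band_D * np.pi / 2.0) * np.sinc(z / np.pi) * (np.sin(z) - 1j * np.cos(z))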

How to convert symbolic expression so it works with curve_fit?

So what I'm trying to do is fit a specific model onto a dataset of x-y values and get the constants of the model. I can get the constant (in this case there is only one) and then the fitted y_opt. Below is a working example of doing so:
import pandas as pd
from scipy.optimize import curve_fit
data = pd.read_csv(r'')
x_measured = data['x[-]'].values
y_measured = data['y[-]'].values
def y_NH(x_eng, D):
    y_comp = D * x_eng * (x_eng**2 + 3 * x_eng + 3) / (1 + x_eng)**2
    return y_comp
D = curve_fit(y_NH, x_measured, y_measured)
y_opt = y_NH(x_measured, D[0])
This works well, but it's not exactly what I need.
The formula for y_comp is something I had to derive manually - originally I had another variable, let's say Y_comp, and got y_comp by differentiating Y_comp (with respect to x_eng). What I would like to achieve is to feed my function Y_comp (because there will be more like Z_comp, F_comp, etc.), have it differentiate it to obtain y_comp (z_comp, f_comp), and then fit the model onto my dataset - the result would then be the constant(s) of the particular model.
I made a start, but it's still not sufficient and I would appreciate some help on this topic. The buggy code is:
import sympy as sy
from sympy.utilities.lambdify import lambdify
def y_NH2(x_eng, D):
    lambda1 = sy.Symbol('lambda1')
    x_eng = sy.Symbol('x_eng')
    #Gi = sy.Symbol('Gi')
    lambda1 = x_eng + 1
    W = lambda1**2 + 2 / lambda1
    y_comp_symb = sy.diff(W, x_eng)
    y_comp = lambdify(x_eng, y_comp_symb, 'numpy')
    y_return = D / 2 * y_comp(x_eng)
    return y_return

y_p = y_NH2(x_measured, 12)
print(y_p)
D = curve_fit(y_NH2, x_measured, y_measured)
y_opt = y_NH2(x_measured, D[0])
This raises an error in curve_fit that is: "error: Result from function call is not a proper array of floats."
Could you please give me a hint?
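One possible direction (a sketch only, not a tested answer): do the symbolic differentiation and lambdify once, outside the fitting function, so that curve_fit only ever sees a plain numeric function of arrays. The helper name make_model below is just illustrative, and x_measured/y_measured are assumed to come from the working example above:
import sympy as sy
from scipy.optimize import curve_fit

def make_model(W_expr, x_sym):
    # Differentiate the symbolic expression once, then turn the result
    # into a fast numeric function of x.
    dW = sy.diff(W_expr, x_sym)
    dW_num = sy.lambdify(x_sym, dW, 'numpy')
    # The function handed to curve_fit works purely on numeric arrays.
    def model(x_eng, D):
        return D / 2 * dW_num(x_eng)
    return model

x_sym = sy.Symbol('x_eng')
lambda1 = x_sym + 1
W = lambda1**2 + 2 / lambda1

y_NH2 = make_model(W, x_sym)
popt, pcov = curve_fit(y_NH2, x_measured, y_measured)
y_opt = y_NH2(x_measured, *popt)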

How do I get the value of gamma function in Python for large complex argument?

I am trying to compute the value of the Beta function for a complex argument. The method scipy.special.beta does not accept complex arguments, so I defined instead
beta = lambda a, b: (gamma(a) * gamma(b)) / gamma(a + b)
It works fine for small values; however, for large values it returns nan. So I dug into the behaviour of the Gamma function:
from scipy.special import gamma
import numpy
radius = 165
phi = (3.0 * numpy.pi) / 4.0
n = 1.9 + numpy.exp(phi * 1j) * radius
print gamma(n)
A 0j will be returned; clearly the value is too small to be represented.
However, although the value of the Gamma function is extremely small, the value of the corresponding Beta function is not, so it still makes sense to do the calculation. But I could not figure out a way.
I tried math.gamma, but it does not accept complex arguments. I tried the log-gamma method provided in this answer, but it returns -0j for
n = 1.9 + numpy.exp(phi * 1j) * radius
numpy.exp(numpy.log(gamma(n)) + numpy.log(gamma(0.5)) - numpy.log(gamma(n + 0.5)))
where I was trying to compute beta(n, 0.5).
Could someone please help me on this issue? Thanks in advance!
I don't know about scipy, but you could use sympy for the evaluation of the beta function. It supports complex arguments. This documentation might help you with that.
So your code would roughly look like this if I understood you correctly:
from sympy.functions.special.beta_functions import beta
import numpy
radius = 165
phi = (3.0 * numpy.pi) / 4.0
n = 1.9 + numpy.exp(phi * 1j) * radius
print(beta(n, 0.5))
>>> 0.0534468376932947 - 0.127743871500741*I
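If you would rather stay within scipy, another option worth trying (an untested sketch; it assumes scipy.special.loggamma, which accepts complex arguments) is to work in log space so the individual gamma values never underflow:
import numpy
from scipy.special import loggamma

def log_beta(a, b):
    # log B(a, b) = log Gamma(a) + log Gamma(b) - log Gamma(a + b)
    return loggamma(a) + loggamma(b) - loggamma(a + b)

radius = 165
phi = (3.0 * numpy.pi) / 4.0
n = 1.9 + numpy.exp(phi * 1j) * radius
print(numpy.exp(log_beta(n, 0.5)))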

Scipy stats.rv_continuous to find MLE from 2-dimensional data

Referring to here, I would like to find the MLE of alpha and lam, given the following PDF
import scipy.stats as st
import numpy as np
class Weib(st.rv_continuous):
    def _pdf(self, data, alpha, lam):
        t = data[0]
        delta = data[1]
        fx = (alpha * lam * (t**(alpha-1)))**(delta) * np.exp(-lam * (t**alpha))
        return fx

    def _argcheck(self, alpha, lam):
        a = alpha > 0
        l = lam > 0
        return (a & l)
And I tried
Weib_inst = Weib(name='Weib')
Samples = Weib_inst.rvs(alpha=1, lam=3, size = 1000)
And it says
'float' object is not subscriptable
Weib_inst._fitstart([[1,2],[2,4]]) also returns the same error message.
It seems this occurs because the data is not one-dimensional, but I cannot find a way around this.
Any help might be appreciated.
You may try to define _fitstart in your subclass. The framework assumes univariate distributions, however.
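For illustration only, an untested sketch of what such an override might look like; it just returns fixed starting guesses for (alpha, lam, loc, scale) and does not change the fact that the framework expects univariate samples:
class Weib(st.rv_continuous):
    # ... _pdf and _argcheck as in the question ...

    def _fitstart(self, data):
        # Fixed initial guesses: the shape parameters alpha and lam,
        # followed by loc and scale, in the order rv_continuous expects.
        return (1.0, 1.0, 0.0, 1.0)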

Fitting a variable Sinc function in python

I would like to fit a sinc function to a bunch of data lines.
Using a Gaussian, the fit itself does work, but the data does not seem to be sufficiently Gaussian, so I figured I could just switch to sinc.
I just tried to put together a short piece of self-running code, but realized that I probably do not fully understand how arrays are handled when handed over to a function, which could be part of the reason why I get error messages when calling my program.
So my code currently looks as follows:
from numpy import exp
from scipy.optimize import curve_fit
from math import sin, pi
def gauss(x, *p):
    print(p)
    A, mu, sigma = p
    return A*exp(-1*(x[:]-mu)*(x[:]-mu)/sigma/sigma)

def sincSquare_mod(x, *p):
    A, mu, sigma = p
    return A * (sin(pi*(x[:]-mu)*sigma) / (pi*(x[:]-mu)*sigma))**2

p0 = [1., 30., 5.]
xpos = range(100)
fitdata = gauss(xpos, p0)
p1, var_matrix = curve_fit(sincSquare_mod, xpos, fitdata, p0)
What I get is:
Traceback (most recent call last):
File "orthogonal_fit_test.py", line 18, in <module>
fitdata = gauss(xpos,p0)
File "orthogonal_fit_test.py", line 7, in gauss
A, mu, sigma = p
ValueError: need more than 1 value to unpack
From my understanding, p is not handed over correctly, which is odd, because it works in my actual code. I then get a similar message from the sincSquare function when fitting, which is probably the same type of error. I am fairly new to the star operator, so there might be a glitch hidden...
Anybody some ideas? :)
Thanks!
You need to make three changes,
import numpy as np

def gauss(x, A, mu, sigma):
    return A*exp(-1*(x[:]-mu)*(x[:]-mu)/sigma/sigma)

def sincSquare_mod(x, A, mu, sigma):
    x = np.array(x)
    return A * (np.sin(pi*(x[:]-mu)*sigma) / (pi*(x[:]-mu)*sigma))**2

fitdata = gauss(xpos, *p0)
1. See the documentation.
2. Replace sin with the numpy version for array broadcasting.
3. Straightforward, right? :P (unpack p0 with the star operator when calling gauss)
Note, I think you are looking for p1, var_matrix = curve_fit(gauss, ...) rather than the call in the OP, which appears not to have a solution.
Also worth noting is that you will get rounding errors as x*Pi gets close to zero that might get magnified. You can approximate as demonstrated below for better results (VB.NET, sorry):
Private Function sinc(x As Double) As Double
    x = (x * Math.PI)
    'The Taylor series expansion of Sin(x)/x is used to limit rounding errors for small values of x
    If x < 0.01 And x > -0.01 Then
        Return 1.0 - x ^ 2 / 6.0 + x ^ 4 / 120.0
    End If
    Return Math.Sin(x) / x
End Function
http://www.wolframalpha.com/input/?i=taylor+series+sin+%28x%29+%2F+x&dataset=&equal=Submit
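For completeness, a rough Python equivalent of that guarded sinc (NumPy's built-in np.sinc already handles x = 0, so this mainly mirrors the VB.NET logic):
import numpy as np

def sinc(x):
    # sin(pi*x)/(pi*x), with a Taylor-series fallback near zero
    # to limit rounding errors for small arguments.
    x = x * np.pi
    if -0.01 < x < 0.01:
        return 1.0 - x**2 / 6.0 + x**4 / 120.0
    return np.sin(x) / x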
