I'm attempting to fit a function using Scipy's Orthogonal distance regression (odr) package and I keep getting the following error:
"RuntimeWarning: invalid value encountered in power"
This also happened when I used scipy's curve_fit function, but there I could always safely ignore the warning. Now, however, it seems to be causing a numerical error that halts the fitting. I based my code on the example I found here:
python scipy.odrpack.odr example (with sample input / output)?
Here is my code:
import numpy as np
import scipy.odr.odrpack as odrpack
def divergence(x, xDiv):
    return (1 - x/xDiv)**(-2.4)
xValues = np.linspace(.25,.37,12)
yValues = np.array([ 6.94970607, 9.12475506, 10.65969954, 12.30241672,
14.44154148, 16.00261267, 19.98693664, 25.93076421,
30.89483997, 35.27106466, 50.81645983, 68.06009144])
xErrors = .0005*np.ones(len(xValues))
yErrors = np.array([ 0.31905094, 0.37956865, 0.24837562, 0.68320078, 1.25915789,
1.40241088, 0.33305157, 1.37165251, 0.32658393, 0.52253429,
1.04506858, 1.30633573])
wcModel = odrpack.Model(divergence)
mydata = odrpack.RealData(xValues, yValues, sx=xErrors, sy=yErrors)
myodr = odrpack.ODR(mydata, wcModel, beta0=[.8])
myoutput = myodr.run()
myoutput.pprint()
From looking at a previous question about this error that I found here:
NumPy, RuntimeWarning: invalid value encountered in power
I suspected that the problem is that I'm raising a negative value to a fractional power. But the quantity I'm raising to the power -2.4, namely (1 - x/xDiv), isn't negative (at least around the initial guess of xDiv = .8). And when I try to make my y-values complex, I get a new error:
"ValueError: y could not be made into a suitable array"
from the line with the command
myoutput = myodr.run().
The only examples I can find that use this odr package fit polynomials, so I suspect that might be the problem?
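One thing worth checking (a sketch, not a confirmed diagnosis): scipy.odr calls the model function as fcn(beta, x), with the parameter vector first, whereas divergence above takes (x, xDiv). Rewritten with that signature, the model might look like:

import numpy as np
import scipy.odr as odr

# scipy.odr calls the model as fcn(beta, x); beta[0] plays the role of xDiv
def divergence(beta, x):
    return (1 - x/beta[0])**(-2.4)

wcModel = odr.Model(divergence)

Note also that even with beta0=[.8] and x between .25 and .37, where the base (1 - x/beta[0]) starts out positive, the optimizer is free to try beta values that make it negative, which is exactly when a real-valued fractional power produces the "invalid value" warning.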
Related
I have a complex function that includes sin(x)/x. At x = 0 this function has a limit of 1, but when evaluated numerically it is NaN.
This function is vectorized for high performance when evaluating a large number of values, but it fails when x = 0.
A simplified version of this problem is shown below.
import numpy as np
def f(x):
    return np.sin(x)/x
x = np.arange(-5,6)
y = f(x)
print(y)
When executed, this returns:
... RuntimeWarning: invalid value encountered in true_divide
return np.sin(x)/x
[-0.19178485 -0.18920062 0.04704 0.45464871 0.84147098 nan
0.84147098 0.45464871 0.04704 -0.18920062 -0.19178485]
This can be addressed by trapping the error, finding the nan values and substituting the known limit.
Is there a better way to handle a function like this?
Note: the actual function is more complex than sin(x)/x. The limit is known. The use of sinc is not an option. sin(x)/x is only used here to illustrate the problem.
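For reference, a minimal sketch of the trap-and-substitute approach described above (assuming the known limit at x == 0 is 1):

import numpy as np

x = np.arange(-5, 6)
# suppress the warning, then replace any nan with the known limit
with np.errstate(divide='ignore', invalid='ignore'):
    y = np.sin(x)/x
y = np.where(np.isnan(y), 1.0, y)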
You could try to use true_divide with where to specify the places where you want to divide, and out to pass in an output array pre-filled with the result you expect at the places where you don't divide. Not sure if this is the most optimal solution, but it would work. In code that would read:
res = np.true_divide(np.sin(x), x, where=(x != 0), out=np.ones_like(x, dtype=float))
(Note the dtype=float: for the integer x produced by np.arange(-5, 6), np.ones_like(x) would be an integer array, and true_divide cannot cast its float result into it.)
This is the option I'm used to when doing my plots:
x = np.arange(-5, 6, dtype=float)
domain = x!=0
fill_with = np.nan
f = np.divide(np.sin(x), x, out=np.full_like(x, fill_with), where=domain)
You can customize the domain and the value to fill with outside of it.
There's an implementation of sinc in numpy that you can use.
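Note that np.sinc is the normalized sinc, np.sinc(t) == sin(pi*t)/(pi*t), so sin(x)/x would be written as follows (a sketch; the original poster ruled sinc out, but it is shown for completeness):

import numpy as np

x = np.arange(-5, 6)
# np.sinc returns 1.0 at t == 0, so no nan is ever produced
y = np.sinc(x/np.pi)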
I'm trying to do some approximate Bayesian computation, and I am able to use the pm.Simulator class to estimate functions with 2 or more parameters (where each parameter is actually an array of multiple values). However, when I try to estimate the values of a single-parameter function, I get an error.
The simplest working example (loosely based on the actual code):
# 2 parameter pm.Simulator snippet that *works*
import pymc3 as pm
import numpy as np

def get_mean_sig2(mu, sigma):
    multi_var = np.random.normal(mu, sigma)
    return multi_var

# create the observed data
obs2 = get_mean_sig2(np.array([10, 5, 2, 1]), np.array([0.5, 1, 2, 1]))

with pm.Model() as m91:
    mu = pm.Uniform('mu', lower=1, upper=15, shape=obs2.shape[0])
    sigma = pm.Uniform('sigma', lower=0.25, upper=3, shape=obs2.shape[0])
    sim = pm.Simulator('sim', get_mean_sig2, params=(mu, sigma), observed=obs2)
    jj = pm.sample_smc(kernel='ABC')
When I remove the 'sigma' parameter, and simplify the problem to only estimating the mean with this code:
# 1 parameter pm.Simulator snippet that doesn't work
def get_only_mean(mu):
    multi_var = np.random.normal(mu, 0.2)
    return multi_var

obs = get_only_mean(np.array([10, 5, 2, 1]))

with pm.Model() as m90:
    mu = pm.Uniform('mu', lower=1, upper=15, shape=obs.shape[0])
    sim = pm.Simulator('sim', get_only_mean, params=(mu), observed=obs)
    jj = pm.sample_smc(kernel='ABC')
I get the error message ValueError: Length of mu ~ Uniform cannot be determined. I have tried variations such as shape=(1, obs.shape[0]) and manually setting shape=4, but they all failed.
I can't understand why this problem suddenly appears; any help would be appreciated.
My environment/system config is:
OS: Linux Mint 19.2
Python 3.8.5
numpy 1.19.5
pymc3 3.11.0
theano 1.1.0
The error disappears when the variable(s) are put into a list rather than a tuple.
For the single-parameter example, using params=[mu] instead of params=(mu) solves the issue. Note that (mu) is not actually a tuple at all; parentheses around a single expression are just grouping, so a one-element tuple would be written (mu,).
A list is a valid data type for multi-parameter situations too, e.g. params=(mu, sigma) is equivalent to params=[mu, sigma].
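Applied to the failing single-parameter snippet, the corrected call might look like this (a sketch, reusing get_only_mean and obs from the question):

with pm.Model() as m90:
    mu = pm.Uniform('mu', lower=1, upper=15, shape=obs.shape[0])
    # a list (or a real one-element tuple, (mu,)) instead of params=(mu)
    sim = pm.Simulator('sim', get_only_mean, params=[mu], observed=obs)
    jj = pm.sample_smc(kernel='ABC')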
I am new to using Python but getting along with it fairly well. I keep getting the error you see below and I'm not sure what the problem is, as I believe the values are correct and stated. What exactly do you think the problem is? I am trying to graph from t = 0 to t = PM, and the formula you see below computes an angle using arccos.
I couldn't find any troubleshooting for this arccos error online. Running Python 3.5.
from __future__ import division  # __future__ imports must come before any other statement
import numpy as np
import matplotlib
from matplotlib import pyplot

rE = 1.50*(10**11)
rM = 3.84*(10**8)
PE = 3.16*(10**7)
PM = 2.36*(10**6)
t = np.linspace(0, PM, 200)

# anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:1: RuntimeWarning: invalid value encountered in arccos
y = 0.5*(np.arccos(2*(np.pi)*t*((1/PM)-(1/PE))+90))
If you simplify to just
np.arccos(90)
(which is the first element in the array being passed to arccos), you'll get the same warning
Why is that? arccos() attempts to find the x for which cos(x) = 90. However, no such value exists, because 90 lies outside the domain of arccos, which is [-1, 1].
Also note that at least in recent versions of numpy, this calculation returns nan
>>> import numpy as np
>>> b = np.arccos(90)
__main__:1: RuntimeWarning: invalid value encountered in arccos
>>> b
nan
The np.arccos() function can only take values between -1 and 1, inclusive.
See: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.arccos.html
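If out-of-range inputs are expected and you only want to keep the argument inside the valid domain, one common workaround is to clip before calling arccos. This is a sketch reusing t, PM and PE from the question; note that clipping masks the real problem if the formula itself is wrong, e.g. if the +90 was meant as a degree offset that should be converted to radians:

import numpy as np

vals = 2*np.pi*t*((1/PM) - (1/PE)) + 90
y = 0.5*np.arccos(np.clip(vals, -1.0, 1.0))  # clip into arccos's domain [-1, 1]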
I'm trying to get a simple PyMC2 model working in PyMC3. I've gotten the model to run but the models give very different MAP estimates for the variables. Here is my PyMC2 model:
import pymc
import numpy as np

theta = pymc.Normal('theta', 0, .88)
X1 = pymc.Bernoulli('X2', p=pymc.Lambda('a', lambda theta=theta: 1./(1+np.exp(-(theta-(-0.75))))), value=[1], observed=True)
X2 = pymc.Bernoulli('X3', p=pymc.Lambda('b', lambda theta=theta: 1./(1+np.exp(-(theta-0)))), value=[1], observed=True)
model = pymc.Model([theta, X1, X2])
mcmc = pymc.MCMC(model)
mcmc.sample(iter=25000, burn=5000)
trace = mcmc.trace('theta')[:]
print "\nThe MAP value for theta is", trace.sum()/len(trace)
That seems to work as expected. I had all sorts of trouble figuring out how to use the equivalent of the pymc.Lambda object in PyMC3. I eventually came across the Deterministic object. The following is my code:
import pymc3
import numpy as np

with pymc3.Model() as model:
    theta = pymc3.Normal('theta', 0, 0.88)
    X1 = pymc3.Bernoulli('X1', p=pymc3.Deterministic('b', 1./(1+np.exp(-(theta-(-0.75))))), observed=[1])
    X2 = pymc3.Bernoulli('X2', p=pymc3.Deterministic('c', 1./(1+np.exp(-(theta-(0))))), observed=[1])
    start = pymc3.find_MAP()
    step = pymc3.NUTS(state=start)
    trace = pymc3.sample(20000, step, njobs=1, progressbar=True)

pymc3.traceplot(trace)
The problem I'm having is that my MAP estimate for theta using PyMC2 is ~0.68 (correct), while the estimate PyMC3 gives is ~0.26 (incorrect). I suspect this has something to do with the way I'm defining the deterministic function. PyMC3 won't let me use a lambda function, so I just have to write the expression in-line. When I try to use lambda theta=theta:... I get this error:
AsTensorError: ('Cannot convert <function <lambda> at 0x157323e60> to TensorType', <type 'function'>)
Something to do with Theano?? Any suggestions would be greatly appreciated!
It works when you use a theano tensor instead of a numpy function in your Deterministic.
import pymc3
import numpy as np
import theano.tensor as tt

with pymc3.Model() as model:
    theta = pymc3.Normal('theta', 0, 0.88)
    X1 = pymc3.Bernoulli('X1', p=pymc3.Deterministic('b', 1./(1+tt.exp(-(theta-(-0.75))))), observed=[1])
    X2 = pymc3.Bernoulli('X2', p=pymc3.Deterministic('c', 1./(1+tt.exp(-(theta-(0))))), observed=[1])
    start = pymc3.find_MAP()
    step = pymc3.NUTS(state=start)
    trace = pymc3.sample(20000, step, njobs=1, progressbar=True)

print "\nThe MAP value for theta is", np.median(trace['theta'])
pymc3.traceplot(trace)
Here's the output (the traceplot figure is omitted here).
Just in case someone else has the same problem, I think I found an answer. After trying different sampling algorithms I found that:
find_MAP gave the incorrect answer
the NUTS sampler gave the incorrect answer
the Metropolis sampler gave the correct answer, yay!
I read somewhere else that the NUTS sampler doesn't work with Deterministic. I don't know why. Maybe that's the case with find_MAP too? But for now I'll stick with Metropolis.
Also, NUTS doesn't handle discrete variables. If you want to use NUTS, you have to split up the samplers:
step1 = pymc3.NUTS([theta])
step2 = pymc3.BinaryMetropolis([X1,X2])
trace = pymc3.sample(10000, [step1, step2], start)
EDIT:
Missed that 'b' and 'c' were defined inline. Removed them from the NUTS function call
The MAP value is not defined as the mean of a distribution, but as its maximum. With pymc2 you can find it with:
M = pymc.MAP(model)
M.fit()
theta.value
which returns array(0.6253614422469552)
This agrees with the MAP that you find with find_MAP in pymc3, which you call start:
{'theta': array(0.6253614811102668)}
The issue of which sampler is better is a different one, and does not depend on the calculation of the MAP. The MAP calculation is an optimization, not a sampling problem.
See: https://pymc-devs.github.io/pymc/modelfitting.html#maximum-a-posteriori-estimates for pymc2.
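In pymc3 terms, the same distinction might be checked like this (a sketch reusing the model defined in the question):

with model:
    map_estimate = pymc3.find_MAP()   # mode of the posterior (the MAP)
    trace = pymc3.sample(20000, step=pymc3.Metropolis())

print(map_estimate['theta'])   # ~0.625, matching the pymc2 MAP above
print(trace['theta'].mean())   # the posterior mean, a different quantity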
I'm trying to fit the Einstein approximation of resistivity in a solid to a set of experimental data.
I have resistivity vs. temperature (from 200 K down to 4 K).
import xlrd as xd
import matplotlib.pyplot as plt
import numpy as np
import pylab as pl
import scipy as sp
from scipy.optimize import curve_fit

# retrieve data from file
data = pl.loadtxt('salita.txt')
Temp = data[:, 1]
Res = data[:, 2]

# define fitting function
def einstein_func(T, ro0, AE, TE):
    nl = np.sinh(TE/(2*T))
    return ro0 + AE*nl*T

p0 = sp.array([1, 1, 1])
coeffs, cov = curve_fit(einstein_func, Temp, Res, p0)
But I get these warnings
crio.py:14: RuntimeWarning: divide by zero encountered in divide
nl = np.sinh(TE/(2*T))
crio.py:14: RuntimeWarning: overflow encountered in sinh
nl = np.sinh(TE/(2*T))
crio.py:15: RuntimeWarning: divide by zero encountered in divide
return ro0 + AE*np.sinh(TE/(2*T))*T
crio.py:15: RuntimeWarning: overflow encountered in sinh
return ro0 + AE*np.sinh(TE/(2*T))*T
crio.py:15: RuntimeWarning: invalid value encountered in multiply
return ro0 + AE*np.sinh(TE/(2*T))*T
Traceback (most recent call last):
File "crio.py", line 19, in <module>
coeffs, cov = curve_fit(einstein_func, Temp, Res, p0)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/optimize/minpack.py", line 511, in curve_fit
raise RuntimeError(msg)
RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
I don't understand why it keeps saying that there is a divide by zero in sinh, since I have strictly positive values. Varying my starting guess has no effect on it.
EDIT: My dataset is organized like this:
4.39531E+0 1.16083E-7
4.39555E+0 -5.92258E-8
4.39554E+0 -3.79045E-8
4.39525E+0 -2.13213E-8
4.39619E+0 -4.02736E-8
4.43130E+0 -1.42142E-8
4.45900E+0 -2.60594E-8
4.46129E+0 -9.00232E-8
4.46181E+0 1.42142E-7
4.46195E+0 -2.13213E-8
4.46225E+0 4.26426E-8
4.46864E+0 -2.60594E-8
4.47628E+0 1.37404E-7
4.47747E+0 9.47612E-9
4.48008E+0 2.84284E-8
4.48795E+0 1.35035E-7
4.49804E+0 1.39773E-7
4.51151E+0 -1.75308E-7
4.54916E+0 -1.63463E-7
4.59176E+0 -2.36902E-9
where the first column is temperature and the second is resistivity (the negative values are due to noise in the trial current, since the sample is a PbIn alloy that becomes superconductive below about 6.7-6.9 K; here we are at 4.5 K).
The arguments I'm providing to sinh are NumPy arrays, and with a linear function ro0 + AE*T my code works. I've tried scipy.optimize.minimize but the result is the same.
Now I see that I have almost nine hundred values in my file; might that be the problem?
I have edited my dataset, removing some lines, and now the only warning showing is
RuntimeWarning: overflow encountered in sinh
How can I work around it?
Here are a couple of observations that could help:
You could try the least-squares fit directly with leastsq, providing the Jacobian, which might help tame it.
I'm guessing you don't want the superconducting temperatures in your data set at all if you're fitting to an Einstein model (do you have a source for this eqn, btw?)
Do make sure your initial guesses are as good as they could possibly be (ro0=AE=TE=1 probably won't cut it).
Plot your data and make sure there aren't any weird artefacts
You seem to be indexing your data array in the wrong way in your code example: if the data is structured as you say, you want:
Temp = data[:, 0]
Res = data[:, 1]
(Python indices start at 0.)
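Putting the indexing fix together with a guard against the sinh overflow, a hedged sketch of the fitting setup might look like this (the clip bound of 700 is an assumption, chosen because float64 sinh overflows once its argument exceeds roughly 710):

import numpy as np
from scipy.optimize import curve_fit

data = np.loadtxt('salita.txt')
Temp = data[:, 0]   # first column: temperature
Res = data[:, 1]    # second column: resistivity

def einstein_func(T, ro0, AE, TE):
    arg = np.clip(TE/(2.0*T), -700.0, 700.0)  # keep sinh's argument out of overflow territory
    return ro0 + AE*np.sinh(arg)*T

p0 = [1.0, 1.0, 1.0]
coeffs, cov = curve_fit(einstein_func, Temp, Res, p0=p0)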