student t confidence interval in python


I am interested in using Python to compute a confidence interval from a Student's t distribution.
I am using the StudentTCI() function in Mathematica and now need to code the same function in Python: http://reference.wolfram.com/mathematica/HypothesisTesting/ref/StudentTCI.html
I am not quite sure how to build this function myself, but before I embark on that, does this function already exist in Python somewhere, for example in numpy? (I haven't used numpy, and my advisor advised against using it if possible.)
What would be the easiest way to solve this problem? Can I copy the source code of StudentTCI() from numpy (if it exists) into my code as a function definition?
edit: I'm going to need to build the Student TCI using python code (if possible). Installing scipy has turned into a dead end. I am having the same problem everyone else is having, and there is no way I can require Scipy for the code I distribute if it takes this long to set up.
Anyone know how to look at the source code for the algorithm in the scipy version? I'm thinking I'll refactor it into a python definition.

I guess you could use scipy.stats.t and its interval method:
In [1]: from scipy.stats import t
In [2]: t.interval(0.95, 10, loc=1, scale=2) # 95% confidence interval
Out[2]: (-3.4562777039298762, 5.4562777039298762)
In [3]: t.interval(0.99, 10, loc=1, scale=2) # 99% confidence interval
Out[3]: (-5.338545334351676, 7.338545334351676)
Sure, you can make your own function if you like. Let's make it look like in Mathematica:
from scipy.stats import t

def StudentTCI(loc, scale, df, confidence=0.95):
    # t.interval takes the confidence level first, then the degrees of freedom
    return t.interval(confidence, df, loc, scale)

print(StudentTCI(1, 2, 10))
print(StudentTCI(1, 2, 10, 0.99))
Result:
(-3.4562777039298762, 5.4562777039298762)
(-5.338545334351676, 7.338545334351676)
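Since the question's edit rules out a scipy dependency, here is a minimal standard-library sketch of the same interval. It is not the scipy algorithm: it integrates the t density with Simpson's rule and inverts the CDF by bisection, and the names t_pdf, t_cdf, t_quantile and student_t_ci are made up for this illustration. It should be fine for typical degrees of freedom and confidence levels, but it has not been tuned for extreme inputs.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus a composite Simpson integral over [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3.0

def t_quantile(p, df):
    """Invert the CDF by bisection; assumes 0.5 < p < 1."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # grow the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def student_t_ci(loc, scale, df, confidence=0.95):
    """Two-sided interval loc +/- q * scale, with q the upper t quantile."""
    q = t_quantile(0.5 + confidence / 2.0, df)
    return (loc - q * scale, loc + q * scale)

print(student_t_ci(1, 2, 10))
print(student_t_ci(1, 2, 10, 0.99))
```

The two printed intervals should agree with the scipy results above to several decimal places.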

Related

Why does scipy bessel root finding not return roots at zero?

I am trying to use code which uses Bessel function zeros for other calculations. I noticed the following piece of code produces results that I consider unexpected.
import scipy
from scipy import special
scipy.special.jn_zeros(1,2)
I would expect the result from this call to be
array([0., 3.83170597])
instead of
array([3.83170597, 7.01558667])
Is there a reason why the root at x=0.0 is not being returned?
From what I can see, the roots are symmetric about x=0 except for the one at the origin itself, but I do not think this would be enough of a reason to leave off that root completely.
The computer I am using has python version 2.7.10 installed and is using scipy version 0.19.0
P.S. the following function is what I am trying to find the zeros of
scipy.special.j1

It appears to be convention to not count the zero at zero, see for example here. Maybe it is considered redundant?
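If the zero at the origin is needed, one option is a thin wrapper that adds it back. This is only a sketch: jn_zeros_including_origin is a hypothetical helper, not part of scipy, and it assumes an integer order n >= 0 and nt >= 1. It relies on the fact that J_n(0) = 0 for integer n >= 1, while J_0(0) = 1.

```python
import numpy as np
from scipy.special import jn_zeros

def jn_zeros_including_origin(n, nt):
    """First nt non-negative zeros of J_n, counting x = 0 when n >= 1."""
    if n == 0:
        # J_0(0) = 1, so the origin is not a zero of J_0
        return jn_zeros(0, nt)
    if nt == 1:
        return np.array([0.0])
    # Prepend the zero at the origin to the positive zeros scipy returns
    return np.concatenate(([0.0], jn_zeros(n, nt - 1)))

print(jn_zeros_including_origin(1, 2))
```

For n=1 and nt=2 this gives the array the question expected, 0 followed by 3.83170597.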

Python, scipy : minimize multivariable function in integral expression

How can I minimize a function (unconstrained) with respect to a[0] and a[1]?
Example (this is a simple example to help me understand scipy, numpy and Python):
import numpy as np
from scipy.integrate import *
from scipy.optimize import *
def function(a):
    return(quad(lambda t: ((np.cos(a[0]))*(np.sin(a[1]))*t),0,3))
I tried:
l=np.array([0.1,0.2])
res=minimize(function,l, method='nelder-mead',options={'xtol': 1e-8, 'disp': True})
but I get errors.
I get the results in matlab.
any idea ?
thanks in advance

This is just a guess, because you haven't included enough information in the question for anyone to really know what the problem is. Whenever you ask a question about code that generates an error, always include the complete error message in the question. Ideally, you should include a minimal, complete and verifiable example that we can run to reproduce the problem. Currently, you define function, but later you use the undefined function chirplet. That makes it a little bit harder for anyone to understand your problem.
Having said that...
scipy.integrate.quad returns two values: the estimate of the integral, and an estimate of the absolute error of the integral. It looks like you haven't taken this into account in function. Try something like this:
def function(a):
    intgrl, abserr = quad(lambda t: np.cos(a[0])*np.sin(a[1])*t, 0, 3)
    return intgrl
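Putting that fix together with the call from the question gives something like the sketch below. The name objective and the tolerance options are my own choices; xatol and fatol are the Nelder-Mead tolerance options in recent scipy versions. Since the integral of t from 0 to 3 is 4.5, the objective equals 4.5*cos(a[0])*sin(a[1]), so the minimum value should be close to -4.5.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

def objective(a):
    # quad returns (integral estimate, absolute error estimate); keep the first
    value, abserr = quad(lambda t: np.cos(a[0]) * np.sin(a[1]) * t, 0, 3)
    return value

start = np.array([0.1, 0.2])
res = minimize(objective, start, method='Nelder-Mead',
               options={'xatol': 1e-8, 'fatol': 1e-8})

print(res.x)    # location of the minimum that was found
print(res.fun)  # objective value there
```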

Autoregressive model using statsmodels in Python

I am trying to start using the AR models in statsmodels. However, I seem to be doing something wrong. Consider the following example, which fails:
from statsmodels.tsa.ar_model import AR
import numpy as np
signal = np.ones(20)
ar_mod = AR(signal)
ar_res = ar_mod.fit(4)
ar_res.predict(4, 60)
I think this should just continue the (trivial) time series consisting of ones. However, in this case it seems to return not enough parameters. len(ar_res.params) equals 4, while it should be 5. In the following example it works:
signal = np.ones(20)
signal[range(0, 20, 2)] = -1
ar_mod = AR(signal)
ar_res = ar_mod.fit(4)
ar_res.predict(4, 60)
I have the feeling that this could be a bug but I am not sure as I have no experience using the package. Maybe someone with more experience can help me...
EDIT: I have reported the issue here.

It works after adding a bit of noise, for example
signal = np.ones(20) + 1e-6 * np.random.randn(20)
My guess is that the constant is not added properly because of perfect collinearity with the signal.
You should open an issue to handle this corner case better. https://github.com/statsmodels/statsmodels/issues
My guess is also that the parameters are not identified in this case, so there might not be any good solution.
(Parameters not identified means that several parameter combinations can produce exactly the same fit, but I think they should all produce the same predictions in this case.)
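The collinearity guess can be checked with plain numpy, without statsmodels. The sketch below builds the regression matrix an AR(4) fit with a constant would use (a column of ones plus four lagged copies of the signal) and compares its rank for the constant and the noisy signal. ar_design_matrix is a hypothetical helper written for this illustration, not the statsmodels implementation.

```python
import numpy as np

def ar_design_matrix(signal, lags):
    """Column of ones followed by the signal lagged by 1..lags steps."""
    n = len(signal)
    columns = [np.ones(n - lags)]
    for k in range(1, lags + 1):
        columns.append(signal[lags - k:n - k])
    return np.column_stack(columns)

constant = np.ones(20)
rank_constant = np.linalg.matrix_rank(ar_design_matrix(constant, 4))

rng = np.random.default_rng(0)
noisy = constant + 1e-6 * rng.standard_normal(20)
rank_noisy = np.linalg.matrix_rank(ar_design_matrix(noisy, 4))

print(rank_constant)  # every column is identical, so the rank is 1
print(rank_noisy)     # the noise makes the 5 columns linearly independent
```

With all five columns identical, the five parameters cannot be separated from one another, which is consistent with the explanation above.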

Vectorization and Optimization of function in Python

I am fairly new to python and trying to transfer some code from matlab to python. I am trying to optimize a function in python using fmin_bfgs. I always try to vectorize the code when possible, but I ran into the following problem that I can't figure out. Here is a test example.
from pylab import *
from scipy.optimize import fmin_bfgs
## Create some linear data
L=linspace(0,10,100).reshape(100,1)
n=L.shape[0]
M=2*L+5
L=hstack((ones((n,1)),L))
m=L.shape[0]
## Define sum of squared errors as non-vectorized and vectorized
def Cost(theta,X,Y):
    return 1.0/(2.0*m)*sum((theta[0]+theta[1]*X[:,1:2]-Y)**2)
def CostVec(theta,X,Y):
    err=X.dot(theta)-Y
    resid=err**2
    return 1.0/(2.0*m)*sum(resid)
## Initialize the theta
theta=array([[0.0], [0.0]])
## Run the minimization on the two functions
print(fmin_bfgs(Cost, x0=theta,args=(L,M)))
print(fmin_bfgs(CostVec, x0=theta,args=(L,M)))
The first call, with the unvectorized function, gives the correct answer, which is just the vector [5, 2]. But the second call, using the vectorized form of the cost function, returns roughly [15, 0]. I have figured out that the 15 doesn't come from nowhere: it is 2 times the mean of the data plus the intercept, i.e., $2\times 5+5$. Any help is greatly appreciated.
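One likely cause, sketched here on the assumption that the optimizer passes theta to the cost function as a flat array of shape (2,) rather than the (2, 1) column that was supplied: X.dot(theta) then has shape (100,), and subtracting the (100, 1) array Y broadcasts to a 100 by 100 matrix of all pairwise differences. Minimizing the sum of squares of that matrix pushes every prediction toward the mean of Y, which is 15, and the slope toward 0. The names x, theta_flat, broadcast_err and fixed_err are illustrative.

```python
import numpy as np

x = np.linspace(0, 10, 100).reshape(100, 1)
Y = 2 * x + 5                          # shape (100, 1)
X = np.hstack((np.ones((100, 1)), x))  # shape (100, 2)

theta_flat = np.array([5.0, 2.0])      # shape (2,): the true parameters, flattened

# (100,) minus (100, 1) broadcasts to (100, 100): every prediction
# is compared with every observation, not only its own.
broadcast_err = X.dot(theta_flat) - Y
print(broadcast_err.shape)

# Flattening Y keeps the residual one-dimensional.
fixed_err = X.dot(theta_flat) - Y.ravel()
print(fixed_err.shape)
print(np.allclose(fixed_err, 0.0))
```

Under that assumption, using Y.ravel() (or reshaping theta to a column inside CostVec) should make the two cost functions agree.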

Creating same random number sequence in Python, NumPy and R

Python, NumPy and R all use the same algorithm (Mersenne Twister) for generating random number sequences. Thus, theoretically speaking, setting the same seed should produce the same random number sequence in all three. This is not the case. I think the three implementations use different parameters, which causes this behavior.
R
>set.seed(1)
>runif(5)
[1] 0.2655087 0.3721239 0.5728534 0.9082078 0.2016819
Python
In [3]: random.seed(1)
In [4]: [random.random() for x in range(5)]
Out[4]:
[0.13436424411240122,
0.8474337369372327,
0.763774618976614,
0.2550690257394217,
0.49543508709194095]
NumPy
In [23]: import numpy as np
In [24]: np.random.seed(1)
In [25]: np.random.rand(5)
Out[25]:
array([ 4.17022005e-01, 7.20324493e-01, 1.14374817e-04,
3.02332573e-01, 1.46755891e-01])
Is there some way in which the NumPy and Python implementations could produce the same random number sequence? Of course, as some comments and answers point out, one could use rpy. What I am specifically looking for is a way to fine-tune the parameters in the respective calls in Python and NumPy to get the same sequence.
Context: The concern comes from an edX course offering in which R is used. In one of the forums, it was asked whether Python could be used, and the staff replied that some assignments would require setting specific seeds and submitting answers.
Related:
Comparing Matlab and Numpy code that uses random number generation From this it seems that the underlying NumPy and Matlab implementation are similar.
python vs octave random generator: This question does come fairly close to the intended answer. Some sort of wrapper around the default state generator is required.

Use rpy2 to call R from Python. Here is a demo; the vector data shares memory with x in R:
import numpy as np
import rpy2.robjects as robjects
data = robjects.r("""
set.seed(1)
x <- runif(5)
""")
print(np.array(data))
data[1] = 1.0
print(robjects.r["x"])

I realize this is an old question, but I've stumbled upon the same problem recently, and created a solution which can be useful to others.
I've written a random number generator in C, and linked it to both R and Python. This way, the random numbers are guaranteed to be the same in both languages since they are generated using the same C code.
The program is called SyncRNG and can be found here: https://github.com/GjjvdBurg/SyncRNG.
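For the Python and NumPy half of the question, another approach is to copy the Mersenne Twister state from one generator into the other instead of trying to match seeds. The sketch below assumes the layout CPython's random.getstate() returns (624 state words followed by the current position) and uses NumPy's legacy RandomState, which accepts an 'MT19937' state tuple. It does not reproduce R's sequence.

```python
import random
import numpy as np

random.seed(1)
version, internal, gauss_next = random.getstate()

# internal holds 624 state words followed by the current position
keys = np.array(internal[:-1], dtype=np.uint32)
position = internal[-1]

# Load the same Mersenne Twister state into a legacy NumPy generator
rs = np.random.RandomState()
rs.set_state(('MT19937', keys, position))

python_draws = [random.random() for _ in range(5)]
numpy_draws = rs.random_sample(5).tolist()

print(python_draws)
print(numpy_draws)
```

Both lists should start with 0.13436424411240122, the first value shown for random.seed(1) in the question.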
