The function scipy.stats.linregress automatically calculates the standard error of the fitted slope. How do I get the standard error of the fitted intercept?
One alternative would be to use the pyfinance.ols module, which has separate standard error attributes for the intercept (alpha) and other coefficients. Disclosure: I wrote this module. It was uploaded to PyPI recently for easier install.
A quick example:
import numpy as np
from pyfinance.ols import OLS

x = np.random.randn(50)
y = np.random.beta(1, 2, 50)
model = OLS(y=y, x=x)
model.se_alpha
# 0.029413047270740914
Under the hood, the class appends a column vector of ones to the design matrix, and the alpha/intercept term is simply the coefficient on that column. Unlike with statsmodels and sklearn, .fit() is effectively called at class instantiation.
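If you prefer to stay within statsmodels, here is a minimal sketch of the same idea (assuming the standard statsmodels OLS API: add_constant prepends the column of ones, and bse holds the standard errors of all coefficients, intercept first):

import numpy as np
import statsmodels.api as sm

x = np.random.randn(50)
y = np.random.beta(1, 2, 50)

# Append the column of ones explicitly, then fit an ordinary OLS.
X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# res.bse holds the standard errors of all coefficients;
# the first entry corresponds to the intercept (the constant column).
print(res.bse[0])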
For backpropagation in PyTorch, the gradients of many simple functions are of course already implemented.
But what if I want a function that evaluates the gradient of an existing primitive function directly, e.g. the derivative of torch.sigmoid(x) with respect to x? I'd also like to be able to backpropagate through this new function.
The goal would be something like the following, but by using only torch.sigmoid instead of a custom (re-)implementation.
import torch
import matplotlib.pyplot as plt
def dsigmoid_dx(x):
    return torch.sigmoid(x) * (1 - torch.sigmoid(x))
xx = torch.linspace(-3.5, 3.5, 100)
yy = dsigmoid_dx(xx)
# ... do other stuff with yy
Of course, I could make x require gradients, pass it through the function, and then use autograd, e.g. as follows:
import torch
import matplotlib.pyplot as plt
xx = torch.linspace(-3.5, 3.5, 100, requires_grad=True)
yy = torch.sigmoid(xx)
grad = torch.autograd.grad(yy, [xx], grad_outputs=torch.ones_like(yy), create_graph=True)[0]
plt.plot(xx.detach(), grad.detach())
plt.plot(xx.detach(), yy.detach(), color='red')
plt.show()
Is it (for individual, primitive functions) possible to somehow directly access the implemented backward function?
The PyTorch docs show how to extend autograd with custom functions, but I can't figure out how to directly access the backward functions of existing ones (again, e.g. torch.sigmoid).
To summarize, I want to avoid having to reimplement simple derivatives of functions, which are obviously already implemented in the framework (and presumably in a numerically stable way). Is this possible? Or do I always have to reimplement it myself?
Since the computation of yy involves only one (native) function, torch.sigmoid, calling autograd.grad (or similarly yy.backward) will ultimately result in directly calling the implemented backward function of sigmoid, which by the looks of it is what you are looking for in the first place. In other words, backpropagating on yy is exactly accessing (i.e. calling) sigmoid's implemented backward function at a given point.
So one alternative interface you can use is backward:
import torch
import matplotlib.pyplot as plt

xx = torch.linspace(-3.5, 3.5, 100, requires_grad=True)
yy = torch.sigmoid(xx)
yy.sum().backward()  # populates xx.grad with d(sigmoid)/dx evaluated at xx
plt.plot(xx.detach(), xx.grad)
plt.plot(xx.detach(), yy.detach(), color='red')
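If you are on a recent PyTorch, another option is to build the derivative function directly. A minimal sketch, assuming torch.func is available (it ships with PyTorch 2.0+; on 1.x the same API lived in the separate functorch package). This still dispatches to the built-in backward of sigmoid, so nothing is re-implemented:

import torch

# torch.func.grad(f) returns a function computing df/dx for scalar-valued f;
# vmap maps that per-scalar derivative over every element of a batch.
dsigmoid_dx = torch.func.vmap(torch.func.grad(torch.sigmoid))

xx = torch.linspace(-3.5, 3.5, 100)
yy = dsigmoid_dx(xx)  # numerically equals sigmoid(xx) * (1 - sigmoid(xx))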
I am looking for help with the implementation of a logit model with statsmodels for binary variables.
Here is my code:
(I am using the feature-selection methods MinimumRedundancyMaximumRelevance and RecursiveFeatureElimination, available in Python.)
for i_mrmr in range(4, 20):
    for i_rfe in range(3, i_mrmr):
        regressors_step1 = ...  # select the MRMR features
        regressors_step2 = ...  # select features from the previous list with the RFE method
        for method in ['newton', 'nm', 'bfgs', 'lbfgs', 'powell', 'cg', 'ncg']:
            logit_model = Logit(y, X.loc[:, regressors_step2])
            try:
                result = logit_model.fit(method=method, cov_type='HC1')
                print(result.summary())
            except Exception:
                result = "error"
I am using Logit from statsmodels.discrete.discrete_model.
The y variable, the target, is binary.
All explanatory variables, the X, are binary too.
The logit model is "functioning" for the different optimization methods; that is to say, I end up with a summary to print for each. Nonetheless, various warnings are printed, such as: "Maximum Likelihood optimization failed to converge."
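A minimal sketch of how the convergence status can be checked programmatically instead of relying on the printed warnings (this assumes the mle_retvals dict with a 'converged' flag that statsmodels maximum-likelihood results expose):

# inside the loop over methods; disp=False silences the printed optimizer output
result = logit_model.fit(method=method, cov_type='HC1', disp=False)
if not result.mle_retvals['converged']:
    print(f'{method}: maximum likelihood optimization did not converge')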
The optimization methods offered by statsmodels are the ones from scipy:
‘newton’ for Newton-Raphson
‘nm’ for Nelder-Mead
‘bfgs’ for Broyden-Fletcher-Goldfarb-Shanno (BFGS)
‘lbfgs’ for limited-memory BFGS with optional box constraints
‘powell’ for modified Powell’s method
‘cg’ for conjugate gradient
‘ncg’ for Newton-conjugate gradient
These methods can be found in scipy.optimize.
Here are my questions:
I did not find any argument anywhere against the use of these optimization methods for a binary set of variables. But, because of the warnings, I am asking myself whether it is correct to do so, and which method is the most appropriate in this case.
Here: Scipy minimize: how to restrict x only to 0 and 1? it is implicitly suggested that a model of the Python MIP (Mixed-Integer Linear Programming) kind could be better for the binary-variables case. From the documentation of the Python MIP package it appears that, to implement this kind of model, I should explicitly give a function to minimize or maximize and also express the constraints (see: https://docs.python-mip.com/en/latest/quickstart.html#creating-models).
Therefore I am wondering whether I need to define a logit function as the objective function. What constraints should I express? Is there an easier way to do this?
I want to do SVM classification (specifically, one-class SVM with sklearn.svm.OneClassSVM) on physical states that come from a different library (tenpy). I define a custom kernel
import numpy as np

def overlap(X, Y):
    return np.array([[x.overlap(y) for y in Y] for x in X])
where .overlap() is a method defined in said library to calculate the overlap between two states. When I try to fit with my data
from sklearn.svm import OneClassSVM

clf = OneClassSVM(kernel=overlap)
clf.fit(states)
where states is a list of such state objects, I get the error
TypeError: float() argument must be a string or a number, not 'MPS'
Is there a way to tell sklearn to skip this check (without editing the source code)?
To my understanding, the nature of the data and how it is processed is in principle not essential to the algorithm, as long as there is a well-defined kernel for the objects.
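A commonly suggested workaround, sketched below under the assumption that each pairwise overlap reduces to a plain float (if it is complex, you may need its real part or absolute value): precompute the Gram matrix yourself and fit with kernel='precomputed', so sklearn's input validation only ever sees a numeric array, never the MPS objects.

import numpy as np
from sklearn.svm import OneClassSVM

def overlap(X, Y):
    # Gram matrix of pairwise overlaps; entries are plain floats
    return np.array([[x.overlap(y) for y in Y] for x in X])

gram = overlap(states, states)            # states: list of MPS objects
clf = OneClassSVM(kernel='precomputed')
clf.fit(gram)

# At prediction time, pass overlaps between new states and the training states:
# preds = clf.predict(overlap(new_states, states))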
I am looking to use the floor() method in one of my models, and I would like to understand what PyTorch does for gradient propagation, since floor is a discontinuous function.
If there is no gradient defined, I could override the backward method to define my own gradient as necessary, but I would like to understand the default behavior, and the corresponding source code if possible.
import torch
x = torch.rand(20, requires_grad=True)
y = 20*x
z = y.floor().sum()
z.backward()
x.grad comes back as all zeros.
z has grad_fn=<SumBackward0>, and following the graph back, the floor step has grad_fn=<FloorBackward0>. So FloorBackward is the gradient method. But there is no reference to the source code of FloorBackward in the PyTorch repository.
The floor function is piecewise constant, which means its gradient must be zero almost everywhere.
While the code doesn't say anything about it explicitly, I expect that the gradient is simply set to a constant zero everywhere (the jumps at integer points, where the derivative is undefined, are ignored).
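If a zero gradient is not what you want, a common workaround is the straight-through estimator: keep floor in the forward pass but treat it as the identity in the backward pass. A minimal sketch via a custom autograd.Function (the name FloorSTE is just an illustrative choice, not a PyTorch built-in):

import torch

class FloorSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return torch.floor(x)

    @staticmethod
    def backward(ctx, grad_output):
        # straight-through: pretend floor is the identity for gradient purposes
        return grad_output

x = torch.rand(20, requires_grad=True)
z = FloorSTE.apply(20 * x).sum()
z.backward()
print(x.grad)  # all 20s now, instead of all zeros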
I'm using a certain StatsModels distribution (Azzalini's Skew Student-t) and I'd like to perform a (one-sample) Kolmogorov-Smirnov test with it.
Is it possible to use Scipy's kstest with a StatsModels distribution? Scipy's documentation (rather vaguely) suggests that the cdf argument may be a string or a callable, with no further details or examples about the latter.
On the other hand, the StatsModels' distribution I'm using has many of the methods that Scipy distributions do; thus, I'm supposing there is some way of using it as a callable argument passed to kstest. Am I wrong?
Here is what I have so far. What I'd like to achieve is commented out in the last line:
import statsmodels.sandbox.distributions.extras as azt
import scipy.stats as stats
x = ([-0.2833379 , -3.05224565, 0.13236267, -0.24549146, -1.75106484,
0.95375723, 0.28628686, 0. , -3.82529261, -0.26714159,
1.07142857, 2.56183746, -1.89491817, -0.3414301 , 1.11589663,
-0.74540174, -0.60470106, -1.93307821, 1.56093656, 1.28078818])
# This is how kstest works.
print(stats.kstest(x, stats.norm.cdf))  # (0.21003262911224113, 0.29814145956367311)
# This is Statsmodels' distribution I'm using. It has a cdf function as well.
ast = azt.ACSkewT_gen()
# This is what I'd want. Executing this will throw a TypeError because ast.cdf
# needs some shape parameters etc.
# print(stats.kstest(x, ast.cdf))
Note: I'll happily use a two-sample KS test if what I'm expecting is not possible; I just wanted to know whether it is.
Those functions were written a long time ago with scipy compatibility in mind, but there have been several changes in scipy in the meantime.
kstest has an args keyword for the distribution parameters.
To get the distribution parameters we can try to estimate them by using the fit method of the scipy.stats distributions. However, estimating all parameters prints some warnings and the estimated df parameter is large. If we fix df at specific values we get estimates without warnings that we can use in the call of kstest.
>>> ast.fit(x)
C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\scipy\integrate\quadpack.py:352: IntegrationWarning: The maximum number of subdivisions (50) has been achieved.
If increasing the limit yields no improvement it is advised to analyze
the integrand in order to determine the difficulties. If the position of a
local difficulty can be determined (singularity, discontinuity) one will
probably gain from splitting up the interval and calling the integrator
on the subranges. Perhaps a special-purpose integrator should be used.
warnings.warn(msg, IntegrationWarning)
C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\scipy\integrate\quadpack.py:352: IntegrationWarning: The integral is probably divergent, or slowly convergent.
warnings.warn(msg, IntegrationWarning)
(31834.800527154337, -2.3475921468088172, 1.3720725621594987, 2.2766515091760722)
>>> p = ast.fit(x, f0=100)
>>> print(stats.kstest(x, ast.cdf, args=p))
(0.13897385693057401, 0.83458552699682509)
>>> p = ast.fit(x, f0=5)
>>> print(stats.kstest(x, ast.cdf, args=p))
(0.097960232618178544, 0.990756154198281)
However, the distribution for the Kolmogorov-Smirnov test assumes that the distribution parameters are fixed and not estimated. If we estimate the parameters as above, then the p-value will not be correct since it is not based on the correct distribution.
For some distributions we can use tables for the kstest with estimated mean and scale parameters, e.g. the Lilliefors test (kstest_normal in statsmodels). If we have estimated shape parameters, then the distribution of the KS test statistic will depend on the parameters of the model, and we could get the p-value from bootstrapping (see the sketch below).
(I don't remember anything about estimating the parameters of the SkewT distribution and whether maximum likelihood estimation has any specific problems.)
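As an illustration of the bootstrapping idea, a minimal parametric-bootstrap sketch (my own recipe, not a statsmodels or scipy routine; refitting and sampling from ACSkewT can be slow and may emit the same integration warnings as above):

import numpy as np
import scipy.stats as stats

n_boot = 200
p = ast.fit(x, f0=5)
d_obs = stats.kstest(x, ast.cdf, args=p)[0]  # observed KS statistic

d_boot = []
for _ in range(n_boot):
    # simulate from the fitted distribution, then re-estimate on each sample
    sample = ast.rvs(*p, size=len(x))
    p_b = ast.fit(sample, f0=5)
    d_boot.append(stats.kstest(sample, ast.cdf, args=p_b)[0])

# p-value: fraction of bootstrap statistics at least as large as the observed one
pvalue = np.mean(np.array(d_boot) >= d_obs)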