Calculating margins in Statsmodels similar to Stata - python

Recently, I saw a post about obtaining Stata-like margins using Statsmodels:
statsmodels get_margeff() for OLS
A user ("tmck") was working on an implementation in Statsmodels.
I tried to comment under that post to ask whether there have been any new developments, but I could not (apparently, one needs a track record of posts before commenting is allowed).
Does anyone know more? Has something already been developed? When will it be possible to calculate margins with Statsmodels in the same way as with Stata or the 'margins' package in R (https://cran.r-project.org/web/packages/margins/vignettes/Introduction.html)?
Best and thanks in advance

Related

Fitting data with coupled ODEs using python package "bumps"

I found the toolbox bumps online (https://pypi.org/project/bumps/), which looks like a well-rounded and easy-to-use approach to fitting data.
I'm interested in fitting data described by two coupled ODEs but, unfortunately, I haven't found any information about this procedure in the docs (https://bumps.readthedocs.io/en/latest/index.html).
Does anyone know how to do it?
Thanks in advance
I asked the developer on GitHub, and he provided two complete examples.
Here is the link: https://github.com/bumps/bumps/issues/26
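For readers who only need the general idea, here is a minimal sketch of fitting two coupled ODEs to data. Note that it does not use bumps; it swaps in scipy (solve_ivp plus least_squares) instead. The ODE system, the parameter names a and b, and the synthetic data are all assumptions made up for the illustration; see the GitHub issue above for the bumps-specific examples.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares


def rhs(t, y, a, b):
    """Hypothetical coupled system: x1 decays into x2, which decays in turn."""
    x1, x2 = y
    return [-a * x1, a * x1 - b * x2]


def simulate(params, t_obs, y0):
    """Integrate the system and return both states at the observation times."""
    a, b = params
    sol = solve_ivp(rhs, (t_obs[0], t_obs[-1]), y0, t_eval=t_obs,
                    args=(a, b), rtol=1e-8, atol=1e-10)
    return sol.y  # shape (2, len(t_obs))


def residuals(params, t_obs, data, y0):
    """Stack the residuals of both states into one vector."""
    return (simulate(params, t_obs, y0) - data).ravel()


# Synthetic observations (assumption for illustration only).
t_obs = np.linspace(0.0, 10.0, 40)
y0 = [1.0, 0.0]
rng = np.random.default_rng(0)
data = simulate((0.8, 0.3), t_obs, y0) + rng.normal(scale=0.005, size=(2, t_obs.size))

# Fit both rate constants simultaneously, constrained to be non-negative.
fit = least_squares(residuals, x0=[0.5, 0.5], args=(t_obs, data, y0),
                    bounds=(0.0, np.inf))
a_hat, b_hat = fit.x
print(a_hat, b_hat)
```

The key design point is that the residual function integrates the whole system once per parameter guess and compares all observed states at once, so the coupling between the equations is respected during the fit.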

How to find the probability density function's curve that best fit my distribution with python

I have sorted data with pandas so that I have this dataframe (I work with anaconda, jupyter notebook):
I plotted a histogram with "écart G-D" on the abscissa and "probabilité" on the ordinate.
I found a topic on Stack Overflow that deals with exactly what I want to do, except that it is 7 years old and the code is obsolete! I still tried it after correcting a few things, but it does not work (besides, I do not even understand the code) ...
Here is the link of the topic:
Fitting empirical distribution to theoretical ones with Scipy (Python)?
I would like to find, and check graphically, the probability density function that best follows the shape of my histogram.
If anyone could enlighten me, it would be great because I'm really in a bind ...
Thank you.
You can fit your data manually by calculating the parameters of a distribution (mean, lambda, etc.) and then using scipy to generate that distribution. Also, if your main objective is just to fit the data to a distribution and use that distribution later, you can use other software (Stat::Fit) to find the best fit for your data automatically and plot it on the histogram.
You can use the distfit library in Python. It will determine the best theoretical distribution for your data.
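As a rough sketch of the manual route, you can fit a few candidate scipy.stats distributions by maximum likelihood and rank them with a goodness-of-fit statistic. The sample below is synthetic and the candidate list is an arbitrary assumption; substitute your own column and whichever distributions seem plausible for it.

```python
import numpy as np
from scipy import stats

# Synthetic sample standing in for the real data (assumption for illustration).
rng = np.random.default_rng(42)
sample = rng.gamma(shape=2.0, scale=1.5, size=2000)

# Candidate distributions to compare (arbitrary choice).
candidates = {"norm": stats.norm, "gamma": stats.gamma, "expon": stats.expon}

results = {}
for name, dist in candidates.items():
    params = dist.fit(sample)  # maximum-likelihood parameter estimates
    ks_stat, _ = stats.kstest(sample, name, args=params)  # smaller is better
    results[name] = (ks_stat, params)

best = min(results, key=lambda name: results[name][0])
print(best, results[best])
```

To check the result graphically, plot the histogram with density=True and overlay candidates[best].pdf(x, *results[best][1]) on a grid of x values.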

Running CHAID on continuous predictors

Has anyone tried to run the CHAID algorithm on continuous predictors?
At first, I used SPSS Modeler and it worked fine,
but when I tried it in Python 3.6, it didn't work for me.
Thanks :)
P.S. The CHAID package can be found here:
https://github.com/Rambatino/CHAID
I'm the author of that library.
It's usually better to post on the issues tab on the github repo as questions have more visibility there.
Unfortunately, with regards to continuous predictors, they need to be binned first before they can be run using CHAID. We haven't implemented a binning strategy as it's very subjective (SPSS makes a lot of decisions under the hood).
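One possible way to do the binning yourself is with pandas before handing the data to the library. The sketch below uses quartile bins via pd.qcut; the column names, the synthetic data, and the choice of four quantile bins are all assumptions for illustration, and the right binning strategy depends on your data.

```python
import numpy as np
import pandas as pd

# Synthetic continuous predictors (assumption for illustration only).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.uniform(18, 80, size=500),
    "income": rng.lognormal(mean=10, sigma=0.5, size=500),
})

# Replace each continuous column with its quartile index (0..3).
binned = df.apply(lambda col: pd.qcut(col, q=4, labels=False))
print(binned.head())
```

The binned columns can then be treated as ordinal predictors. pd.cut with explicit edges is an alternative when domain-specific thresholds make more sense than quantiles.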

Statsmodels seasonal_decompose - what is naive about it?

I have been working with time series in Python and using sm.tsa.seasonal_decompose. The docs introduce the function like this:
We added a naive seasonal decomposition tool in the same vein as R’s decompose.
Here is a copy of the code from the docs and its output:
import statsmodels.api as sm
dta = sm.datasets.co2.load_pandas().data
# deal with missing values. see issue
dta.co2.interpolate(inplace=True)
res = sm.tsa.seasonal_decompose(dta.co2)
res.plot()
They say it is naive but there is no disclaimer about what is wrong with it. Does anyone know?
I did some (ahem... naive) research and, according to the reference, it seems that StatsModels uses the classical moving-average method to estimate the trend and then applies seasonal decomposition (you can check more here, specifically about Moving Average and Classical Decomposition).
However, more advanced seasonal decomposition techniques are available, such as STL decomposition, which also has some Python implementations. (UPDATE - 11/04/2019: as pointed out in the comments by @squarespiral, such implementations appear to have been merged into the master branch of StatsModels.)
At the above links, you can find a complete reference on the advantages and disadvantages of each one of the proposed methods.
Hope it helps!
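To make the "classical" idea concrete, here is a rough pandas-only sketch of an additive decomposition: a centered moving average for the trend, then per-period averages of the detrended series for the seasonal component. It is an illustration of the general technique on synthetic monthly data (the series and its parameters are assumptions), not a copy of the StatsModels implementation.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend + seasonal cycle + noise (assumption).
period = 12
t = np.arange(120)
rng = np.random.default_rng(1)
true_seasonal = 10 * np.sin(2 * np.pi * t / period)
y = pd.Series(0.5 * t + true_seasonal + rng.normal(scale=0.1, size=t.size))

# Trend: centered 2x12 moving average (half weight on the two end points).
weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
trend = y.rolling(window=period + 1, center=True).apply(
    lambda w: np.dot(w, weights), raw=True
)

# Seasonal: average the detrended values for each position in the cycle,
# then center the pattern so it sums to zero over one period.
detrended = y - trend
seasonal_means = detrended.groupby(t % period).mean()
seasonal_means -= seasonal_means.mean()
seasonal = pd.Series(seasonal_means.to_numpy()[t % period], index=y.index)

# Residual: whatever is left over.
resid = y - trend - seasonal
```

The sketch shows where the simplicity comes from: the trend is undefined for the first and last half-period, and the seasonal pattern is forced to be identical in every cycle. STL relaxes the second restriction; recent StatsModels versions provide it as statsmodels.tsa.seasonal.STL.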

How to do 2SLS IV regression using statsmodels python?

I'm trying to do 2 stage least squares regression in python using the statsmodels library:
from statsmodels.sandbox.regression.gmm import IV2SLS
# Arguments are (endog, exog, instrument); drop columns with axis=1
resultIV = IV2SLS(dietdummy['Log Income'],
                  dietdummy.drop(['Log Income', 'Diabetes'], axis=1),
                  dietdummy.drop(['Log Income', 'Reads Nutri'], axis=1)).fit()
Reads Nutri is my endogenous variable, my instrument is Diabetes, and my dependent variable is Log Income.
Did I do this right? It is much different from the way I would do it in Stata.
Also, when I do resultIV.summary(), I get a TypeError (something to do with the F statistic being nonetype). How can I resolve this?
I found this question when I wanted to do an IV2SLS regression myself and had the same problem. So, just for everybody else who landed here.
The documentation of statsmodels shows how to use this command. Your arguments are endog, exog, and instrument, in that order, where exog includes the variables that are instrumented, and instrument contains the instruments together with the other control variables. In that sense, your model is fine.
The TypeError you found is currently an open bug in versions 0.6.0 and 0.8.1 and will be fixed in 0.9.0 according to the milestone.
Update (28.06.2018): Version 0.9.0 was released on 15 May and should include a fix for the aforementioned bug.
Personally, I found the IV2SLS function in linearmodels 4.5 to be more intuitive than the statsmodels version, as it has separate parameters for the dependent variable and the endogenous variable(s), whereas the statsmodels version doesn't. The results I got from the linearmodels function lined up with what I would get with an Excel add-in I got through school.
If you choose to use the linearmodels function, this guide should also help. For instance, it showed me that I needed to add in a constant for my function to produce the correct output.
