Model equation of logistic regression in sklearn or statsmodels.api - Python

After running the logistic regression, I want to see the model equation. Is there a way of finding that?

In sklearn you can access the coefficients and intercept via the coef_ and intercept_ attributes, as documented here.
I don't know about statsmodels offhand, but according to the first tutorial that shows up on Google here, you can use the params attribute of the result.
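A minimal sketch of both approaches, using synthetic stand-in data (the arrays below are placeholders, not from the question):

import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

# Toy data standing in for your own X and y
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# sklearn: the fitted estimator exposes coef_ and intercept_
clf = LogisticRegression().fit(X, y)
print("logit(p) =", clf.intercept_[0], "+", clf.coef_[0], "· x")

# statsmodels: add an explicit constant, then read params off the result
result = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print(result.params)  # [const, x1, x2]

In either case the model equation is logit(p) = intercept + coef · x.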

Related

Why does the Lasso regression intercept come out different when we include the intercept in the data vs. letting the model fit the intercept itself?

So I am trying to fit a Lasso regression model with polynomial features.
To create these polynomial features, I am using from sklearn.preprocessing import PolynomialFeatures.
It has a parameter include_bias which can be set to True or False, as in PolynomialFeatures(degree=5, include_bias=False). This bias is nothing but the column of 1s that represents the intercept in the final equation, as explained in detail here in another answer.
The issue arises when I either set include_bias=True and then fit the Lasso regression without an intercept (since it has already been taken care of), or choose not to include the bias and set fit_intercept=True in from sklearn.linear_model import Lasso. They should technically give the same result for the intercept coefficient, right? But it turns out they don't.
Is there some internal bug in scikit-learn, or what am I missing here? Let me know if anyone wants to try this on their own with some data; I'll share some.
When PolynomialFeatures.include_bias=True but Lasso.fit_intercept=False, the lasso model thinks the all-ones column is just another feature, and so the "intercept term" is penalized with the L1 penalty. When PolynomialFeatures.include_bias=False and Lasso.fit_intercept=True, the lasso model knows not to penalize the intercept that it is fitting.
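A small sketch that reproduces the discrepancy on made-up data (the degree and alpha are arbitrary choices, not from the question):

import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(50, 1))
y = 3 + 2 * x[:, 0] + rng.normal(scale=0.1, size=50)

# Variant A: the all-ones bias column sits inside the feature
# matrix, so the L1 penalty shrinks the "intercept" coefficient.
Xa = PolynomialFeatures(degree=2, include_bias=True).fit_transform(x)
a = Lasso(alpha=0.1, fit_intercept=False).fit(Xa, y)
print("penalized intercept:", a.coef_[0])

# Variant B: no bias column; the intercept is fitted separately
# and excluded from the penalty.
Xb = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
b = Lasso(alpha=0.1, fit_intercept=True).fit(Xb, y)
print("unpenalized intercept:", b.intercept_)

The two intercepts differ precisely because only variant A penalizes the intercept term.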

How can I use logistic regression in sklearn for continuous but bounded dependent variable?

How can I use logistic regression in sklearn for a continuous but bounded (0 <= y <= 1) dependent variable? If it's not possible in sklearn, with what library can I do it?
It completely depends on the distribution of your problem.
These two pictures explain the difference between linear and logistic regression; there are also other regression types (e.g. polynomial regression). Depending on your data points (shown in red), you need to search for the right approach.
Here is the overview from scikit: https://scikit-learn.org/stable/supervised_learning.html#supervised-learning
See the discussion here: https://scikit-learn-general.narkive.com/4dSCktaM/using-logistic-regression-on-a-continuous-target-variable
There are two suggestions:
Stop doing logistic regression on something that is not a binary target.
Use statsmodels (https://www.statsmodels.org); one concrete option, a fractional-logit GLM, is sketched below.
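One way to follow the statsmodels suggestion is a fractional-logit model: a binomial-family GLM fit to a continuous target in [0, 1]. A minimal sketch with made-up data (the arrays are placeholders):

import numpy as np
import statsmodels.api as sm

# Synthetic bounded target: values in (0, 1)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1 / (1 + np.exp(-(0.5 + X[:, 0] - 2 * X[:, 1])))

# Fractional logit: Binomial family with a continuous response.
# statsmodels accepts non-integer y here (it may emit a warning).
model = sm.GLM(y, sm.add_constant(X), family=sm.families.Binomial())
result = model.fit()
print(result.params)         # fitted coefficients on the logit scale
print(result.predict()[:5])  # predictions stay within (0, 1)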

Logistic regression multiclass classification with Python API

Currently the Python API does not yet support multi-class classification within Spark, but it will in the future, as described on the Spark page.
Is there any release date, or any chance to run multi-class logistic regression with Python? I know it works with Scala, but I would like to run it with Python. Thank you.
scikit-learn's LogisticRegression offers a multi_class parameter. From the docs:
Multiclass option can be either ‘ovr’ or ‘multinomial’. If the option chosen is ‘ovr’, then a binary problem is fit for each label. Else the loss minimised is the multinomial loss fit across the entire probability distribution. Works only for the ‘lbfgs’ solver.
Hence, multi_class='ovr' seems to be the right choice for you.
For more information: see this link
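For illustration, a minimal sklearn sketch with synthetic data (note that newer sklearn releases deprecate the multi_class parameter and handle multiclass targets automatically):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic 3-class problem standing in for real data
X, y = make_classification(n_samples=300, n_features=5,
                           n_informative=4, n_classes=3, random_state=0)

# 'ovr' fits one binary classifier per class; 'multinomial'
# minimises the joint multinomial loss (lbfgs solver)
clf = LogisticRegression(multi_class='ovr', solver='lbfgs').fit(X, y)
print(clf.predict(X[:5]))
print(clf.predict_proba(X[:5]).shape)  # (5, 3): one column per class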
Added:
As per the pyspark documentation, you can still do multi class regression using their API. Using the class pyspark.mllib.classification.LogisticRegressionWithLBFGS, you get the optional parameter numClasses for multi-class classification.
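A minimal pyspark sketch, assuming an RDD of LabeledPoint rows (the tiny dataset below is a stand-in):

from pyspark import SparkContext
from pyspark.mllib.classification import LogisticRegressionWithLBFGS
from pyspark.mllib.regression import LabeledPoint

sc = SparkContext(appName="multiclass-logreg")

# Tiny stand-in dataset: labels in {0, 1, 2} plus two features
data = sc.parallelize([
    LabeledPoint(0.0, [0.0, 1.0]),
    LabeledPoint(1.0, [1.0, 0.0]),
    LabeledPoint(2.0, [1.0, 1.0]),
])

# numClasses switches on multi-class training from the Python API
model = LogisticRegressionWithLBFGS.train(data, numClasses=3)
print(model.predict([1.0, 0.0]))

sc.stop()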

sklearn SVM non-integer outputs

I've been trying to find this information around and couldn't find any help.
What I want to do is get a float number as output from sklearn's SVM, to serve as input for a sub-classifier.
Is it possible to get an output from the SVM like 0.89898 instead of 1, indicating how close an example is to being classified as 1?
Thank you
Platt scaling can help to achieve what you want. It fits a logistic sigmoid curve on top of the output of SVM in a post-hoc fashion.
To do this in sklearn, you'll need to fit your SVM with the probability parameter set to True. Then, you can use the fitted model's predict_proba() method to get a floating point output. More documentation can be found here. You'll also find related discussions in this thread.
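A minimal sketch, with synthetic data in place of a real problem:

from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

# probability=True enables Platt scaling (an extra internal
# cross-validated fit), which unlocks predict_proba afterwards
svm = SVC(probability=True, random_state=0).fit(X, y)
print(svm.predict_proba(X[:3]))      # per-class probabilities, e.g. 0.89898
print(svm.decision_function(X[:3]))  # raw signed margins, also continuous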

Weighted logistic regression in Python

I'm looking for a good implementation of logistic regression (not regularized) in Python, in a package that can also take a weight for each vector. Can anyone suggest a good implementation / package?
Thanks!
I notice that this question is quite old now but hopefully this can help someone. With sklearn, you can use the SGDClassifier class to create a logistic regression model by simply passing in 'log' as the loss:
sklearn.linear_model.SGDClassifier(loss='log', ...).
This class implements weighted samples in the fit() function:
classifier.fit(X, Y, sample_weight=weights)
where weights is an array containing the sample weights, which must (obviously) have the same length as the number of data points in X.
See http://scikit-learn.org/dev/modules/generated/sklearn.linear_model.SGDClassifier.html for full documentation.
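A minimal sketch with made-up data and weights; note that recent sklearn releases spell the loss 'log_loss' instead of 'log':

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] > 0).astype(int)
weights = rng.uniform(0.5, 2.0, size=len(X))  # one weight per sample

# loss='log_loss' (formerly 'log') gives logistic regression via SGD
clf = SGDClassifier(loss='log_loss')
clf.fit(X, y, sample_weight=weights)
print(clf.coef_, clf.intercept_)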
The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(class_weight='balanced')
model = model.fit(X, y)
EDIT
Sample weights can be added in the fit method; you just have to pass an array of length n_samples. Check out the documentation -
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.fit
Hope this does it...
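A minimal sketch combining both mechanisms, with placeholder data:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] > 0).astype(int)
weights = rng.uniform(0.5, 2.0, size=len(X))

# class_weight rebalances classes; sample_weight in fit()
# weights individual observations
model = LogisticRegression(class_weight='balanced')
model.fit(X, y, sample_weight=weights)
print(model.coef_, model.intercept_)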
I think what you want is statsmodels. It has great support for GLM and other linear methods. If you're coming from R, you'll find the syntax very familiar.
statsmodels weighted regression
getting started w/ statsmodels
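A minimal statsmodels sketch of weighted logistic regression via a GLM with frequency weights (the data and weights are placeholders):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(100, 2)))
y = rng.integers(0, 2, size=100)
weights = rng.integers(1, 4, size=100)  # integer repetition counts

# freq_weights scales each observation's contribution
# to the binomial log-likelihood
model = sm.GLM(y, X, family=sm.families.Binomial(), freq_weights=weights)
print(model.fit().summary())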
Have a look at scikits.learn's logistic regression implementation.
Do you know NumPy? If not, also take a look at SciPy and matplotlib.
