I saw multiple similar questions about the following error in sklearn:
'AttributeError: LinearRegression object has no attribute...'
I couldn't find any hint about my problem, though:
AttributeError: LinearRegression object has no attribute 'model'
I tried to do a multiple linear regression y ~ x with the following code:
import matplotlib.pyplot as plt
import statsmodels.api as sma
from sklearn import linear_model
#https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
#perform linear regression
df_x = df.drop('Migration distance', axis=1) #for simplicity I didn't use any testing data, just these two
df_y = df['Migration distance']
reg = linear_model.LinearRegression().fit(df_x, df_y)
reg_score=reg.score(df_x, df_y)
print('R2 score:',reg_score)
#plot the residuals
fig = plt.figure(figsize=(12,8))
fig = sma.graphics.plot_regress_exog(reg, 'Migration distance', fig=fig)
but this error occurs every time I try to plot the residuals:
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_12904/573591118.py in <module>
10
11 fig = plt.figure(figsize=(12,8))
---> 12 fig = sma.graphics.plot_regress_exog(reg, 'Migration distance', fig=fig)
C:\ProgramData\Anaconda3\lib\site-packages\statsmodels\graphics\regressionplots.py in plot_regress_exog(results, exog_idx, fig)
218 fig = utils.create_mpl_fig(fig)
219
--> 220 exog_name, exog_idx = utils.maybe_name_or_idx(exog_idx, results.model)
221 results = maybe_unwrap_results(results)
222
AttributeError: 'LinearRegression' object has no attribute 'model'
I think my linear regression works because I can compute the R2 score, but I have no clue how to get past this error in order to plot the residuals.
As the documentation for graphics.plot_regress_exog implies, the model passed in the results argument (i.e. your reg here) must be
A result instance with resid, model.endog and model.exog as attributes.
i.e. a fitted statsmodels results object, and not a scikit-learn estimator such as your LinearRegression here. In other words, the function cannot work with scikit-learn models.
Since you are actually doing a simple OLS regression, if you really need the functionality, I would suggest using the respective statsmodels model instead of the scikit-learn one.
Related
I'm pretty new to python so I've adapted code that I've found in online resources to try to create this regression. The code that I'm pulling from is working perfectly and I've barely changed anything but the data sources, so I'm not sure what I'm doing wrong. Any help would be incredible!
Here's the code I'm using:
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
dfgmatgpa = df[["GradGPA", "GMATscore"]]
dfgmatgpa = dfgmatgpa.dropna()
dfgmatgpa.head()
GradGPA GMATscore
17 2.80000 340.0
18 2.80000 340.0
32 4.15000 660.0
36 3.88143 570.0
41 3.28571 540.0
# Load the diabetes dataset
gmatgpa_X, gmatgpa_y = dfgmatgpa(return_X_y=True)
# Use only one feature
gmatgpa_X = gmatgpa_X[:, np.newaxis, 2]
# Split the data into training/testing sets
gmatgpa_X_train = gmatgpa_X[:-20]
gmatgpa_X_test = gmatgpa_X[-20:]
# Split the targets into training/testing sets
gmatgpa_y_train = gmatgpa_y[:-20]
gmatgpa_y_test = gmatgpa_y[-20:]
# Create linear regression object
regr = linear_model.LinearRegression()
# Train the model using the training sets
regr.fit(gmatgpa_X_train, gmatgpa_y_train)
# Make predictions using the testing set
gmatgpa_y_pred = regr.predict(gmatgpa_X_test)
# The coefficients
print('Coefficients: \n', regr.coef_)
# The mean squared error
print('Mean squared error: %.2f'
% mean_squared_error(gmatgpa_y_test, gmatgpa_y_pred))
# The coefficient of determination: 1 is perfect prediction
print('Coefficient of determination: %.2f'
% r2_score(gmatgpa_y_test, gmatgpa_y_pred))
# Plot outputs
plt.scatter(gmatgpa_X_test, gmatgpa_y_test, color='black')
plt.plot(gmatgpa_X_test, gmatgpa_y_pred, color='blue', linewidth=3)
plt.xticks(())
plt.yticks(())
plt.show()
error:
TypeError Traceback (most recent call last)
<ipython-input-321-b5f145507243> in <module>
----> 1 gmatgpa_X, gmatgpa_y = dfgmatgpa(return_X_y=True)
2
3 # Use only one feature
4 gmatgpa_X = gmatgpa_X[:, np.newaxis, 2]
5
TypeError: 'DataFrame' object is not callable
Perhaps you are referring to some sample code like this:
from sklearn.datasets import load_iris
data = load_iris(return_X_y=True)
Here load_iris() is a function, and when return_X_y is True, it returns a (data, target) tuple.
https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html
In your case dfgmatgpa is a DataFrame, not a function, which is why you got the error. You can instead define X and y separately: gmatgpa_X as a DataFrame holding the feature column(s) and gmatgpa_y as the target column.
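As a rough sketch of what that could look like, using the five rows printed in the question. Which column is the feature and which is the target is my assumption (GMATscore predicting GradGPA); swap them if you meant it the other way round.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# The five rows shown in the question's dfgmatgpa.head() output
dfgmatgpa = pd.DataFrame(
    {
        "GradGPA": [2.80000, 2.80000, 4.15000, 3.88143, 3.28571],
        "GMATscore": [340.0, 340.0, 660.0, 570.0, 540.0],
    },
    index=[17, 18, 32, 36, 41],
)

# Assumption: GMATscore is the feature and GradGPA is the target
gmatgpa_X = dfgmatgpa[["GMATscore"]]  # double brackets keep a 2-D DataFrame
gmatgpa_y = dfgmatgpa["GradGPA"]      # single brackets give a 1-D Series

regr = LinearRegression().fit(gmatgpa_X, gmatgpa_y)
print("Coefficients:", regr.coef_)
```

With X built this way, the "use only one feature" slicing step from the diabetes example is no longer needed, because gmatgpa_X already has a single column.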
I am new to Python and deep learning. I trained a multi-class classifier and want to plot a confusion matrix, but I am facing an error.
Here is my code:
from sklearn.metrics import plot_confusion_matrix
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay
Y_pred = model.predict_generator(test_generator)
y_pred = np.argmax(Y_pred, axis=1)
category_names = sorted(os.listdir('D:/DiabaticRetinopathy/mq_dataset/DR_Normal/train'))
print(category_names)
cm = confusion_matrix(test_generator.classes, y_pred)
plot_confusion_matrix(cm, classes = category_names, title='Confusion Matrix', normalize=False, figname = 'Confusion_matrix_concrete.jpg')
I updated sklearn to version 0.24 and restarted my kernel after updating, but it is still giving an error:
TypeError: plot_confusion_matrix() got an unexpected keyword argument 'classes'
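plot_confusion_matrix has no classes, title or figname keywords, so remove them. Also note that its first arguments are a fitted scikit-learn estimator, X and y_true, so it cannot take an already computed matrix. Since you already have cm, plot it with ConfusionMatrixDisplay (which you already import) and pass the names as display_labels:
ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=category_names).plot()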
Documentation: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.plot_confusion_matrix.html
There is a keyword labels, but not classes. Note that labels selects which classes index the matrix, while display_labels sets the names shown on the axes.
The error states that the keyword classes you provided is not a keyword that this function recognizes. This happens in your last line.
The documentation gives a list of the keywords that you can use.
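Here is a small self-contained sketch of plotting a confusion matrix from predictions you already have, which is the situation with a Keras generator. The label arrays and category names below are made-up stand-ins for test_generator.classes, np.argmax(Y_pred, axis=1) and the folder names. It uses ConfusionMatrixDisplay directly, which as far as I know is available in 0.24 and also in newer releases where plot_confusion_matrix was removed (1.2).

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Hypothetical stand-ins for test_generator.classes and the argmax predictions
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2, 1])
category_names = ["DR", "Mild", "Normal"]  # hypothetical folder names

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)

disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=category_names)
disp.plot()
disp.ax_.set_title("Confusion Matrix")  # replaces the unsupported title= keyword
# disp.figure_.savefig("Confusion_matrix_concrete.jpg")  # replaces figname=
```

The title and file name are set on the matplotlib axes and figure that the display object exposes, rather than being passed as keywords.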
The example from https://runawayhorse001.github.io/LearningApacheSpark/clustering.html
caused a strange error when I decided to test the clustering example for Spark.
Example:
from sklearn.cluster import KMeans
import numpy as np
cost = np.zeros(20)
for k in range(2,20):
    kmeans = KMeans()\
            .setK(k)\
            .setSeed(1) \
            .setFeaturesCol("indexedFeatures")\
            .setPredictionCol("cluster")
    model = kmeans.fit(data)
    cost[k] = model.computeCost(data)
It raised an error about KMeans attributes, even though fit() is already called in the code.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-22-296a7d54514a> in <module>
2 cost = np.zeros(20)
3 for k in range(2,20):
----> 4 kmeans = KMeans()\
5 .setK(k)\
6 .setSeed(1) \
AttributeError: 'KMeans' object has no attribute 'setK'
I had similar issues in the past and .fit() solved them, but now it is not working.
You're importing the wrong KMeans. I believe the KMeans in that example refers to the one in Spark ML, not the one in scikit-learn.
from pyspark.ml.clustering import KMeans
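If you actually want to stay with scikit-learn instead of Spark, the same loop can be sketched roughly as below. This is my own translation, not part of the linked tutorial: I am assuming n_clusters and random_state play the role of setK and setSeed, and that inertia_ (the within-cluster sum of squared distances) is what computeCost was reporting. The data comes from make_blobs purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D data standing in for the tutorial's feature matrix
X, _ = make_blobs(n_samples=300, centers=4, random_state=1)

cost = np.zeros(20)
for k in range(2, 20):
    # scikit-learn takes its settings as constructor arguments, not setters
    kmeans = KMeans(n_clusters=k, random_state=1, n_init=10)
    model = kmeans.fit(X)
    # Sum of squared distances of samples to their closest cluster centre
    cost[k] = model.inertia_

print(cost[2:])
```

The scikit-learn estimator works on a NumPy array or DataFrame, so there are no feature or prediction column names to set.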
I am implementing simple linear regression and multiple linear regression using pandas and sklearn.
My code is as follows:
import pandas as pd
import numpy as np
import scipy.stats
from sklearn import linear_model
from sklearn.metrics import r2_score
df = pd.read_csv("Auto.csv", na_values='?').dropna()
lr = linear_model.LinearRegression()
y = df['mpg']
x = df['displacement']
X = x.values.reshape(-1,1)
sklearn_model = lr.fit(X,y)
This works fine, but for multiple linear regression it doesn't work WITH the () at the end of sklearn's LinearRegression. When I use it with the parentheses, I get the following error:
TypeError: 'LinearRegression' object is not callable
My multiple linear regression code is as follows:
lr = linear_model.LinearRegression
feature_1 = np.array(df[['displacement']])
feature_2 = np.array(df[['weight']])
feature_1 = feature_1.reshape(len(feature_1),1)
feature_2 = feature_2.reshape(len(feature_2),1)
X = np.hstack([feature_1,feature_2])
sklearn_mlr = lr(X,df['mpg'])
I want to know what I'm doing wrong. Additionally, I'm not able to print the various attributes in the linear regression method if I don't use the () at the end. e.g.
print(sklearn_mlr.coef_)
Gives me the error:
AttributeError: 'LinearRegression' object has no attribute 'coef_'
Given this snippet:
lr = linear_model.LinearRegression
feature_1 = np.array(df[['displacement']])
feature_2 = np.array(df[['weight']])
feature_1 = feature_1.reshape(len(feature_1),1)
feature_2 = feature_2.reshape(len(feature_2),1)
X = np.hstack([feature_1,feature_2])
sklearn_mlr = lr(X,df['mpg'])
Your issue is that you have not initialized an instance of the LinearRegression class. You need to initialize it like you did in the first example. Then you can use the fit method like so:
lr = linear_model.LinearRegression()
feature_1 = np.array(df[['displacement']])
feature_2 = np.array(df[['weight']])
feature_1 = feature_1.reshape(len(feature_1),1)
feature_2 = feature_2.reshape(len(feature_2),1)
X = np.hstack([feature_1,feature_2])
sklearn_mlr = lr.fit(X,df['mpg'])
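Once an instance has been fit it will have the attributes listed in the documentation (e.g. .coef_). As it was, lr(X, df['mpg']) called the LinearRegression class itself, which at best builds a new, unfitted instance with your data passed as constructor arguments, so .coef_ did not exist yet.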
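To make the class-versus-instance difference concrete, here is a short sketch. The six rows are made-up values standing in for Auto.csv; only the column names come from the question. Selecting both columns with double brackets also replaces the reshape and hstack steps.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical rows standing in for Auto.csv
df = pd.DataFrame({
    "displacement": [307.0, 350.0, 318.0, 97.0, 140.0, 113.0],
    "weight": [3504.0, 3693.0, 3436.0, 2130.0, 2264.0, 2372.0],
    "mpg": [18.0, 15.0, 18.0, 27.0, 28.0, 24.0],
})

X = df[["displacement", "weight"]]  # already 2-D with one column per feature

lr = LinearRegression()             # the parentheses create an instance
sklearn_mlr = lr.fit(X, df["mpg"])  # fit() returns that same instance
print(sklearn_mlr.coef_)            # one coefficient per feature

# Calling the instance like a function reproduces the first error
try:
    lr(X, df["mpg"])
except TypeError as exc:
    print(exc)
```

So the parentheses belong on the class (to create the estimator) and the data belongs in fit(), never in a call on the instance.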
lr is a class in your example.
You need to initialize it, and then call .fit(X,df['mpg']) from the instance.
Why not import it as follows:
from sklearn.linear_model import LinearRegression
In my opinion it is much cleaner than what you did. You can then use it like this:
lr = LinearRegression()
I've been attempting to fit this data by a Linear Regression, following a tutorial on bigdataexaminer. Everything was working fine up until this point. I imported LinearRegression from sklearn, and printed the number of coefficients just fine. This was the code before I attempted to grab the coefficients from the console.
import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib.pyplot as plt
import sklearn
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression
boston = load_boston()
bos = pd.DataFrame(boston.data)
bos.columns = boston.feature_names
bos['PRICE'] = boston.target
X = bos.drop('PRICE', axis = 1)
lm = LinearRegression()
After I had all this set up I ran the following command, and it returned the proper output:
In [68]: print('Number of coefficients:', len(lm.coef_))
Number of coefficients: 13
However, now if I ever try to print this same line again, or use 'lm.coef_', it tells me coef_ isn't an attribute of LinearRegression, right after I JUST used it successfully, and I didn't touch any of the code before I tried it again.
In [70]: print('Number of coefficients:', len(lm.coef_))
Traceback (most recent call last):
File "<ipython-input-70-5ad192630df3>", line 1, in <module>
print('Number of coefficients:', len(lm.coef_))
AttributeError: 'LinearRegression' object has no attribute 'coef_'
The coef_ attribute is created when the fit() method is called. Before that, it will be undefined:
>>> import numpy as np
>>> import pandas as pd
>>> from sklearn.datasets import load_boston
>>> from sklearn.linear_model import LinearRegression
>>> boston = load_boston()
>>> lm = LinearRegression()
>>> lm.coef_
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-22-975676802622> in <module>()
7
8 lm = LinearRegression()
----> 9 lm.coef_
AttributeError: 'LinearRegression' object has no attribute 'coef_'
If we call fit(), the coefficients will be defined:
>>> lm.fit(boston.data, boston.target)
>>> lm.coef_
array([ -1.07170557e-01, 4.63952195e-02, 2.08602395e-02,
2.68856140e+00, -1.77957587e+01, 3.80475246e+00,
7.51061703e-04, -1.47575880e+00, 3.05655038e-01,
-1.23293463e-02, -9.53463555e-01, 9.39251272e-03,
-5.25466633e-01])
My guess is that somehow you forgot to call fit() when you ran the problematic line.
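If you want to guard against this, you can check whether the estimator has been fitted before touching coef_. A minimal sketch follows; it uses random data with 13 columns rather than the Boston data, since load_boston was removed in scikit-learn 1.2, so the numbers are not comparable to the output above.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LinearRegression
from sklearn.utils.validation import check_is_fitted

# Synthetic data with 13 features (stand-in for the Boston data)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 13))
y = X @ np.arange(1, 14) + rng.normal(scale=0.1, size=100)

lm = LinearRegression()
print(hasattr(lm, "coef_"))  # a fresh estimator has no coef_ yet

try:
    check_is_fitted(lm)
except NotFittedError as exc:
    print("Not fitted yet:", exc)

lm.fit(X, y)         # creates coef_ and intercept_
check_is_fitted(lm)  # passes silently once the estimator is fitted
print("Number of coefficients:", len(lm.coef_))
```

Keep in mind that re-running lm = LinearRegression() in the console creates a fresh, unfitted object, which would explain coef_ disappearing between two otherwise identical print calls.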
I also got the same problem while working with linear regression: object has no attribute 'coef'.
It only needs a slight change in the syntax:
linreg = LinearRegression()
linreg.fit(X,y) # fit the linear model to the data
print(linreg.intercept_)
print(linreg.coef_)
I hope this helps. Thanks!