AttributeError: LinearRegression object has no attribute 'coef_' - python

I've been attempting to fit this data by a Linear Regression, following a tutorial on bigdataexaminer. Everything was working fine up until this point. I imported LinearRegression from sklearn, and printed the number of coefficients just fine. This was the code before I attempted to grab the coefficients from the console.
import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib.pyplot as plt
import sklearn
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression
boston = load_boston()
bos = pd.DataFrame(boston.data)
bos.columns = boston.feature_names
bos['PRICE'] = boston.target
X = bos.drop('PRICE', axis = 1)
lm = LinearRegression()
After I had all this set up I ran the following command, and it returned the proper output:
In [68]: print('Number of coefficients:', len(lm.coef_))
Number of coefficients: 13
However, now if I ever try to print this same line again, or use 'lm.coef_', it tells me coef_ isn't an attribute of LinearRegression, right after I JUST used it successfully, and I didn't touch any of the code before I tried it again.
In [70]: print('Number of coefficients:', len(lm.coef_))
Traceback (most recent call last):
File "<ipython-input-70-5ad192630df3>", line 1, in <module>
print('Number of coefficients:', len(lm.coef_))
AttributeError: 'LinearRegression' object has no attribute 'coef_'

The coef_ attribute is created when the fit() method is called. Before that, it will be undefined:
>>> import numpy as np
>>> import pandas as pd
>>> from sklearn.datasets import load_boston
>>> from sklearn.linear_model import LinearRegression
>>> boston = load_boston()
>>> lm = LinearRegression()
>>> lm.coef_
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-22-975676802622> in <module>()
7
8 lm = LinearRegression()
----> 9 lm.coef_
AttributeError: 'LinearRegression' object has no attribute 'coef_'
If we call fit(), the coefficients will be defined:
>>> lm.fit(boston.data, boston.target)
>>> lm.coef_
array([ -1.07170557e-01, 4.63952195e-02, 2.08602395e-02,
2.68856140e+00, -1.77957587e+01, 3.80475246e+00,
7.51061703e-04, -1.47575880e+00, 3.05655038e-01,
-1.23293463e-02, -9.53463555e-01, 9.39251272e-03,
-5.25466633e-01])
My guess is that somehow you forgot to call fit() when you ran the problematic line.
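To make this easy to check in a session, here is a minimal sketch of testing whether the estimator has been fitted before reading coef_. It uses synthetic data with 13 features as a stand-in for the Boston data (an assumption, since load_boston is not available in recent scikit-learn releases), and the variable names are only illustrative.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LinearRegression
from sklearn.utils.validation import check_is_fitted

# Synthetic stand-in for the Boston data: 50 rows, 13 features (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 13))
y = X @ np.arange(1, 14) + rng.normal(scale=0.1, size=50)

lm = LinearRegression()

# Before fit(): coef_ does not exist yet.
fitted_before = hasattr(lm, "coef_")
try:
    check_is_fitted(lm)
    raised = False
except NotFittedError:
    raised = True

# After fit(): coef_ is created with one entry per feature.
lm.fit(X, y)
fitted_after = hasattr(lm, "coef_")
n_coefficients = len(lm.coef_)
print("Number of coefficients:", n_coefficients)
```

If lm = LinearRegression() is re-run in the console, the new object is unfitted again, so fit() has to be called once more before coef_ is available.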

I ran into the same problem with linear regression: the object has no attribute 'coef_'.
Only a slight change in the code is needed: call fit() before reading the attributes.
linreg = LinearRegression()
linreg.fit(X,y) # fit the linear model to the data
print(linreg.intercept_)
print(linreg.coef_)
I hope this helps.

Related

Object not found while trying to write a pickle file

I am trying to do cancer detection using a random forest. I am trying to create a pickle file with the command pickle.dump(forest,open("model.pkl","wb")), but I am getting a NameError:
NameError Traceback (most recent call last)
c:\Users\hp\newtest\pcancer.ipynb Cell 6' in <cell line: 1>()
----> 1 pickle.dump(forest,open("model.pkl","wb"))
NameError: name 'forest' is not defined
This is my source code for detection:
import numpy as np
import pandas as pd
import warnings as wr
#Ignoring warnings
from sklearn.exceptions import UndefinedMetricWarning
wr.filterwarnings("ignore", category=UndefinedMetricWarning)
import pickle
df=pd.read_csv('Prostate_cancer_data.csv')
print(df.head(10))
print(df.shape)
print(df.isna().sum())
df=df.dropna(axis=1)#Drop the column with empty data
df=df.drop(['id'],axis=1)
#Encoding first column
from sklearn.preprocessing import LabelEncoder
labelencoder_X=LabelEncoder()
df.iloc[:,0]=labelencoder_X.fit_transform(df.iloc[:,0].values)
#Splitting data for dependence
X=df.iloc[:,1:].values
Y=df.iloc[:,0].values
#Train-Test split
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test=train_test_split(X,Y,test_size=0.25,random_state=1)
#Standard scaling
from sklearn.preprocessing import StandardScaler
sc=StandardScaler()
X_train=sc.fit_transform(X_train)
X_test=sc.fit_transform(X_test)
from sklearn.ensemble import RandomForestClassifier
def models(X_train,Y_train):
#Random forest classifier
forest=RandomForestClassifier(n_estimators=10,criterion='entropy',random_state=0)
forest.fit(X_train,Y_train)
print("Random Forest:",forest.score(X_train,Y_train))
return forest
print("Accuracy")
model=models(X_train,Y_train)
model=models(X_train,Y_train)
This part of the code is not indented correctly, so it becomes a local statement inside the function and acts as a recursive call.
There is an indentation problem in the last section of your code. Here is the correctly indented code. Also, when you create the pickle file, write the model object to it, not forest: forest is a local variable inside models(), and its value is returned into the variable named model.
from sklearn.ensemble import RandomForestClassifier
def models(X_train,Y_train):
    #Random forest classifier
    forest=RandomForestClassifier(n_estimators=10,criterion='entropy',random_state=0)
    forest.fit(X_train,Y_train)
    print("Random Forest:",forest.score(X_train,Y_train))
    return forest
print("Accuracy")
model=models(X_train,Y_train)
pickle.dump(model,open("model.pkl","wb"))
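As a small self-contained sketch of the same idea: the name forest only exists inside the function, so the returned object is what gets pickled. The data here is synthetic (an assumption, standing in for Prostate_cancer_data.csv), train_forest is a hypothetical name, and the file is written to a temporary directory.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def train_forest(X_train, Y_train):
    # forest is local to this function; callers only see the return value.
    forest = RandomForestClassifier(n_estimators=10, criterion="entropy", random_state=0)
    forest.fit(X_train, Y_train)
    return forest


# Synthetic stand-in for the asker's data set (assumption).
X_demo, Y_demo = make_classification(n_samples=100, n_features=8, random_state=1)
model = train_forest(X_demo, Y_demo)

# The local name never leaks into the module namespace.
forest_is_global = "forest" in globals()

# Pickle the returned object and load it back; the with blocks close the files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    with open(path, "wb") as fh:
        pickle.dump(model, fh)
    with open(path, "rb") as fh:
        restored = pickle.load(fh)

same_predictions = bool((restored.predict(X_demo) == model.predict(X_demo)).all())
print(forest_is_global, same_predictions)
```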

How to slice a XGBClassifier/XGBRegressor model into sub-models?

This document shows that a model trained with the native XGBoost API can be sliced with the following code:
from sklearn.datasets import make_classification
import xgboost as xgb
booster = xgb.train({
'num_parallel_tree': 4, 'subsample': 0.5, 'num_class': 3},
num_boost_round=num_boost_round, dtrain=dtrain)
sliced: xgb.Booster = booster[3:7]
I tried it and it worked.
Since XGBoost provides Scikit-Learn Wrapper interface, I tried something like this:
from xgboost import XGBClassifier
clf_xgb = XGBClassifier().fit(X_train, y_train)
clf_xgb_sliced: clf_xgb.Booster = booster[3:7]
But got following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-18-84155815d877> in <module>
----> 1 clf_xgb_sliced: clf_xgb.Booster = booster[3:7]
AttributeError: 'XGBClassifier' object has no attribute 'Booster'
Since XGBClassifier has no attribute 'Booster', is there any way to slice a Scikit-Learn Wrapper interface trained XGBClassifier(/XGBRegressor) model?
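The problem is the type hint clf_xgb.Booster, which refers to an attribute that does not exist on the classifier; Booster is a class in the xgboost module. Use get_booster() to obtain the underlying Booster from the wrapper and slice that. Try: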
clf_xgb_sliced: xgb.Booster = clf_xgb.get_booster()[3:7]
instead.

AttributeError: 'KMeans' object has no attribute 'setK'

The example from https://runawayhorse001.github.io/LearningApacheSpark/clustering.html raised a strange error when I tried to test the clustering example for Spark.
Example:
from sklearn.cluster import KMeans
import numpy as np
cost = np.zeros(20)
for k in range(2,20):
    kmeans = KMeans()\
        .setK(k)\
        .setSeed(1) \
        .setFeaturesCol("indexedFeatures")\
        .setPredictionCol("cluster")
    model = kmeans.fit(data)
    cost[k] = model.computeCost(data)
It raised an AttributeError on KMeans, even though fit() is called in the loop.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-22-296a7d54514a> in <module>
2 cost = np.zeros(20)
3 for k in range(2,20):
----> 4 kmeans = KMeans()\
5 .setK(k)\
6 .setSeed(1) \
AttributeError: 'KMeans' object has no attribute 'setK'
I had similar issues in the past and .fit() solved them, but now it is not working.
You're importing the wrong KMeans. I believe the KMeans in that example is the one from Spark ML, not the one from scikit-learn. Import it like this:
from pyspark.ml.clustering import KMeans
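If the goal is to stay with scikit-learn instead of Spark, the same loop might look roughly like the sketch below. This is not the tutorial's code: scikit-learn's KMeans takes its settings as constructor arguments rather than setter methods, and inertia_ (the sum of squared distances to the closest center) plays a role similar to computeCost. The data is synthetic and the name features is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the asker's data: 200 points in 4 blobs (assumption).
features, _ = make_blobs(n_samples=200, centers=4, random_state=1)

cost = np.zeros(20)
for k in range(2, 20):
    # n_clusters and random_state replace Spark's setK() and setSeed().
    kmeans = KMeans(n_clusters=k, random_state=1, n_init=10)
    model = kmeans.fit(features)
    # inertia_ is the within-cluster sum of squared distances.
    cost[k] = model.inertia_

print(cost[2:])
```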

Multivariate Regression Error "AttributeError: 'numpy.ndarray' object has no attribute 'columns'"

I'm trying to run a multivariate linear regression but I'm getting an error when trying to get the coefficients of the regression model.
The error I'm getting is this:
AttributeError: 'numpy.ndarray' object has no attribute 'columns'
Here's the code I'm using:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as seabornInstance
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
%matplotlib inline
# Main files
dataset = pd.read_csv('namaste_econ_model.csv')
dataset.shape
dataset.describe()
dataset.isnull().any()
#Dividing data into "attributes" and "labels". X variable contains all the attributes and y variable contains labels.
X = dataset[['Read?', 'x1', 'x2', 'x3', 'x4', 'x5', 'x6' , 'x7','x8','x9','x10','x11','x12','x13','x14','x15','x16','x17','x18','x19','x20','x21','x22','x23','x24','x25','x26','x27','x28','x29','x30','x31','x32','x33','x34','x35','x36','x37','x38','x39','x40','x41','x42','x43','x44','x45','x46','x47']].values
y = dataset['Change in Profit (BP)'].values
plt.figure(figsize=(15,10))
plt.tight_layout()
seabornInstance.distplot(dataset['Change in Profit (BP)'])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
regressor = LinearRegression()
regressor.fit(X_train, y_train)
coeff_df = pd.DataFrame(regressor.coef_, X.columns, columns=['Coefficient'])
coeff_df
Full error:
Traceback (most recent call last):
File "<ipython-input-67-773b9f78bc01>", line 14, in <module>
coeff_df = pd.DataFrame(regressor.coef_, X.columns, columns=['Coefficient'])
AttributeError: 'numpy.ndarray' object has no attribute 'columns'
Any help on this will be highly appreciated!
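I did the exact same thing; removing the .values from the X variable on line 18 worked for me! With .values, X is a NumPy array, which has no columns attribute. Without it, X stays a DataFrame and X.columns works.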
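A minimal sketch of that fix, using a small synthetic DataFrame in place of namaste_econ_model.csv (the column names x1, x2, x3 and target are assumptions for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the asker's CSV (assumption).
rng = np.random.default_rng(0)
dataset = pd.DataFrame(rng.normal(size=(40, 3)), columns=["x1", "x2", "x3"])
dataset["target"] = 2.0 * dataset["x1"] - dataset["x3"] + rng.normal(scale=0.01, size=40)

# Keep X as a DataFrame (no .values) so the column labels survive.
X = dataset[["x1", "x2", "x3"]]
y = dataset["target"].values

# The NumPy array behind the DataFrame has no columns attribute.
has_columns_as_array = hasattr(X.values, "columns")

regressor = LinearRegression()
regressor.fit(X, y)

# One row per feature, labelled with the DataFrame's column names.
coeff_df = pd.DataFrame(regressor.coef_, index=X.columns, columns=["Coefficient"])
print(coeff_df)
```

train_test_split also accepts DataFrames and returns DataFrames, so the split from the question should work the same way with X left as a DataFrame.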

Turi Create Error : 'module' object not callable

I am trying to implement a nearest neighbor classifier in Turi Create, but I am unsure about the error I am getting. The error occurs when I create the actual model. I am using Python 3.6, if that helps.
Error:
Traceback (most recent call last):
File "/Users/PycharmProjects/turi/turi.py", line 51, in <module>
iris_cross()
File "/Users/PycharmProjects/turi/turi.py", line 37, in iris_cross
clf = tc.nearest_neighbor_classifier(train_data, target='4', features=features)
TypeError: 'module' object is not callable
Code:
import turicreate as tc
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import datasets
import time
import numpy as np
#Iris Classification Cross Validation
def iris_cross():
    iris = datasets.load_iris()
    features = ['0','1','2','3']
    target = iris.target_names
    x = iris.data
    y = iris.target.astype(int)
    undata = np.column_stack((x,y))
    data = tc.SFrame(pd.DataFrame(undata))
    print(data)
    train_data, test_data = data.random_split(.8)
    clf = tc.nearest_neighbor_classifier(train_data, target='4', features=features)
    print('done')
iris_cross()
tc.nearest_neighbor_classifier is a module, not a function, so you have to call its create() function. See the library API.
Run the following line of code instead:
clf = tc.nearest_neighbor_classifier.create(train_data, target='4', features=features)
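The same error can be reproduced with any module, which may make the message easier to recognize. The sketch below uses the standard library's json module purely as an analogy; it is not Turi Create code.

```python
import json

# Calling the module itself fails, just like tc.nearest_neighbor_classifier(...).
try:
    json('{"a": 1}')
    message = ""
except TypeError as exc:
    message = str(exc)

# Calling a function defined inside the module works,
# analogous to tc.nearest_neighbor_classifier.create(...).
parsed = json.loads('{"a": 1}')

print(message)
print(parsed)
```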
