'DataFrame' object has no attribute 'to_frame' - python

I am new to Python and am following this tutorial: https://www.hackerearth.com/practice/machine-learning/machine-learning-projects/python-project/tutorial/
This is the dataframe miss:
miss = train.isnull().sum()/len(train)
miss = miss[miss>0]
miss.sort_values(inplace = True)
miss
Electrical 0.000685
MasVnrType 0.005479
MasVnrArea 0.005479
BsmtQual 0.025342
BsmtCond 0.025342
BsmtFinType1 0.025342
BsmtExposure 0.026027
BsmtFinType2 0.026027
GarageCond 0.055479
GarageQual 0.055479
GarageFinish 0.055479
GarageType 0.055479
GarageYrBlt 0.055479
LotFrontage 0.177397
FireplaceQu 0.472603
Fence 0.807534
Alley 0.937671
MiscFeature 0.963014
PoolQC 0.995205
dtype: float64
Now I just want to visualize those missing values:
#visualising missing values
miss = miss.to_frame()
miss.columns = ['count']
miss.index.names = ['Name']
miss['Name'] = miss.index
And this is the error I got:
AttributeError Traceback (most recent call last)
<ipython-input-42-cd3b25e8862a> in <module>()
1 #visualising missing values
----> 2 miss = miss.to_frame()
C:\Users\Username\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
2742 if name in self._info_axis:
2743 return self[name]
-> 2744 return object.__getattribute__(self, name)
2745
2746 def __setattr__(self, name, value):
AttributeError: 'DataFrame' object has no attribute 'to_frame'
What am I missing here?

Check print(type(miss)); it should print <class 'pandas.core.series.Series'>.
What you have is a DataFrame, so something earlier in your code has gone wrong. Calling to_frame() on a DataFrame reproduces the error:
df = pd.DataFrame()
df.to_frame()
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "C:\Users\UR_NAME\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\generic.py", line 3614, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'to_frame'
I traced the tutorial; the steps run in this order:
train = pd.read_csv("train.csv")
print(type(train)) # <class 'pandas.core.frame.DataFrame'>
miss = train.isnull().sum()/len(train)
print(type(miss)) # <class 'pandas.core.series.Series'>
The line miss = train.isnull().sum()/len(train) produces a pandas.core.series.Series from a pandas.core.frame.DataFrame.
This is probably where your code went wrong.

If you use a notebook, the first run of the current cell converts miss to a DataFrame, and the output is displayed as expected. If you run the same cell again, you get the error because miss is already a DataFrame. To fix the problem, rerun the previous cell (which rebuilds miss as a Series) and then run the current cell. This is simply how notebook state works.
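A minimal sketch of a cell that can be rerun safely, assuming a small made-up DataFrame in place of the tutorial's train.csv (the sample values and the name miss_df are illustrative, not from the tutorial). The Series stays in miss and the converted frame goes under a separate name, so to_frame() is never called on a DataFrame:

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's train.csv
train = pd.DataFrame({
    "Alley": [None, None, "Grvl", None],
    "Fence": [None, "MnPrv", None, "GdWo"],
    "LotArea": [8450, 9600, 11250, 9550],
})

# Fraction of missing values per column; the result is a Series
miss = train.isnull().sum() / len(train)
miss = miss[miss > 0].sort_values()

# Keep the Series in `miss` and store the frame under a new name,
# so rerunning this cell does not turn `miss` into a DataFrame.
miss_df = miss.to_frame(name="count")
miss_df.index.name = "Name"
miss_df["Name"] = miss_df.index

print(type(miss))
print(miss_df)
```

Reusing the name miss for the frame, as the tutorial does, also works, but only as long as the cell that builds the Series is rerun first.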

Related

AttributeError: 'Series' object has no attribute 'pivot_table'

ISL_eventPassdf[ISL_eventPassdf["match_id"].isin([3817897, 3813305])]["match_id"].drop_duplicates()
Series([], Name: match_Id, dtype: int64)
ISL_FINAL_Data =ISL_eventPassdf[ISL_eventPassdf["match_id"].isin([3817897, 3813305])]["match_id"]
ISL_FINAL_Data.pivot_table(values="type.id", index="player.name", columns="pass.recipient.name", aggfunc="count")
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11460/1068599328.py in
----> 1 ISL_FINAL_Data.pivot_table(values="type.id", index="player.name", columns="pass.recipient.name", aggfunc="count")
C:\Python\Python310\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
5905 ):
5906 return self[name]
-> 5907 return object.__getattribute__(self, name)
5908
5909 def __setattr__(self, name: str, value) -> None:
AttributeError: 'Series' object has no attribute 'pivot_table'
Please help me fix this.
The error says 'Series' object has no attribute 'pivot_table'.
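One likely cause, sketched under assumptions: selecting ["match_id"] after the row filter reduces the result to a single column, which is a Series, and pivot_table is a DataFrame method. The tiny events frame below is made up to mirror the column names in the question; only the row filter is kept so the result stays a DataFrame:

```python
import pandas as pd

# Hypothetical miniature of ISL_eventPassdf with the same column names
events = pd.DataFrame({
    "match_id": [3817897, 3817897, 3813305, 1],
    "player.name": ["A", "A", "B", "C"],
    "pass.recipient.name": ["B", "B", "A", "A"],
    "type.id": [30, 30, 30, 30],
})

# Filter rows only. Appending ["match_id"] here would return a Series,
# which has no pivot_table method.
selected = events[events["match_id"].isin([3817897, 3813305])]

# Count passes from each player to each recipient
passes = selected.pivot_table(
    values="type.id",
    index="player.name",
    columns="pass.recipient.name",
    aggfunc="count",
)
print(passes)
```

If the filter matches no rows in the real data, the pivot comes back empty, so it is worth checking the match ids and the dtype of match_id as well.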

AttributeError: 'DataFrame' object has no attribute 'raw_ratings'

I am getting an error while using the following command
trainset, testset = train_test_split(t2data, test_size=.15,train_size=0.85)
The dataset contains user rating, user ids and product ids.
error message:
AttributeError: 'DataFrame' object has no attribute 'raw_ratings'
My dataframe doesn't have any attribute by the name raw_ratings.
This is how I am reading the CSV:
rdata = pd.read_csv('ratings_Electronics.csv', header=0, names=['userid','productid','rating','timestamp'], skipinitialspace=True)
So I am unable to understand where this error comes from. Any help would be appreciated, thanks.
detailed error:
AttributeError Traceback (most recent call last)
in ()
----> 1 trainset, testset = train_test_split(t2data, test_size=.15,train_size=0.85)
2 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in __getattr__(self, name)
5134 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5135 return self[name]
-> 5136 return object.__getattribute__(self, name)
5137
5138 def __setattr__(self, name: str, value) -> None:
AttributeError: 'DataFrame' object has no attribute 'raw_ratings'
You may be passing the wrong data type. Most likely you are passing a pandas DataFrame where a Surprise Dataset is expected.
I found this example from NicolasHug helpful: https://github.com/NicolasHug/Surprise/issues/20
That solution worked for me.
Note also that you read the CSV into the variable rdata but split t2data.

.idmin() and .idmax() in a Series not working

I am learning about python/pandas attributes in a Series. I can get it to display the min and max values, but I want to display the min and max index values and I get an error message.
google.min()
49.95
google.max()
782.22
google.idmin()
AttributeError Traceback (most recent call last)
in
----> 1 google.idmin(True)
/opt/anaconda3/envs/pandas_playground/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
5272 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5273 return self[name]
-> 5274 return object.__getattribute__(self, name)
5275
5276 def __setattr__(self, name: str, value) -> None:
AttributeError: 'Series' object has no attribute 'idmin'
After some searching, I found I was simply using the wrong methods.
idxmin and idxmax work just fine.
google.idxmax()
3011
google.idxmin()
11
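For illustration, a small sketch of the difference between the value and the index label, using a made-up Series (the prices and index labels below are hypothetical, chosen only to echo the numbers above). min and max return values, while idxmin and idxmax return the index labels where those values occur:

```python
import pandas as pd

# Hypothetical prices with non-default index labels
google = pd.Series([50.12, 49.95, 782.22, 600.0], index=[10, 11, 3011, 3012])

lowest_label = google.idxmin()    # index label of the smallest value
highest_label = google.idxmax()   # index label of the largest value

print(lowest_label, google.loc[lowest_label])
print(highest_label, google.loc[highest_label])
```

Because the result is a label, it is looked up with .loc rather than by position.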

Cannot get a SIMPLE HISTOGRAM to plot in Python 3.7 Notebook

Below is the error I received when trying to plot a histogram of the column "Age" from the DataFrame "code":
code['Age'].plt.hist()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-44-2117f9d17105> in <module>
----> 1 code['Age'].plt.hist()
~\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
5177 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5178 return self[name]
-> 5179 return object.__getattribute__(self, name)
5180
5181 def __setattr__(self, name, value):
AttributeError: 'Series' object has no attribute 'plt'
Use the hist function directly from matplotlib:
import matplotlib.pyplot as plt
plt.hist(code['Age'])
plt.show()
This should work. You can also call the Series' own hist method:
import matplotlib.pyplot as plt
code['Age'].hist()
plt.show()

XGBoost: AttributeError: 'DataFrame' object has no attribute 'feature_names'

I've trained an XGBoost Classifier for binary classification. While training the model on train data using CV and predicting on the test data, I face the error AttributeError: 'DataFrame' object has no attribute 'feature_names'.
My code is as follows:
folds = StratifiedKFold(n_splits=5, shuffle=False, random_state=44000)
oof = np.zeros(len(X_train))
predictions = np.zeros(len(X_test))
for fold_, (trn_idx, val_idx) in enumerate(folds.split(X_train, y_train)):
    print("Fold {}".format(fold_+1))
    trn_data = xgb.DMatrix(X_train.iloc[trn_idx], y_train.iloc[trn_idx])
    val_data = xgb.DMatrix(X_train.iloc[val_idx], y_train.iloc[val_idx])
    clf = xgb.train(params = best_params,
                    dtrain = trn_data,
                    num_boost_round = 2000,
                    evals = [(trn_data, 'train'), (val_data, 'valid')],
                    maximize = False,
                    early_stopping_rounds = 100,
                    verbose_eval=100)
    oof[val_idx] = clf.predict(X_train.iloc[val_idx], ntree_limit=clf.best_ntree_limit)
    predictions += clf.predict(X_test, ntree_limit=clf.best_ntree_limit)/folds.n_splits
How to deal with it?
Here is the complete error trace:
Fold 1
[0] train-auc:0.919667 valid-auc:0.822968
Multiple eval metrics have been passed: 'valid-auc' will be used for early stopping.
Will train until valid-auc hasn't improved in 100 rounds.
[100] train-auc:1 valid-auc:0.974659
[200] train-auc:1 valid-auc:0.97668
[300] train-auc:1 valid-auc:0.977696
[400] train-auc:1 valid-auc:0.977704
Stopping. Best iteration:
[376] train-auc:1 valid-auc:0.977862
Exception ignored in: <bound method DMatrix.__del__ of <xgboost.core.DMatrix object at 0x7f3d9c285550>>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/xgboost/core.py", line 368, in __del__
if self.handle is not None:
AttributeError: 'DMatrix' object has no attribute 'handle'
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-55-d52b20cc0183> in <module>()
19 verbose_eval=100)
20
---> 21 oof[val_idx] = clf.predict(X_train.iloc[val_idx], ntree_limit=clf.best_ntree_limit)
22
23 predictions += clf.predict(X_test, ntree_limit=clf.best_ntree_limit)/folds.n_splits
/usr/local/lib/python3.6/dist-packages/xgboost/core.py in predict(self, data, output_margin, ntree_limit, pred_leaf, pred_contribs, approx_contribs)
1042 option_mask |= 0x08
1043
-> 1044 self._validate_features(data)
1045
1046 length = c_bst_ulong()
/usr/local/lib/python3.6/dist-packages/xgboost/core.py in _validate_features(self, data)
1271 else:
1272 # Booster can't accept data with different feature names
-> 1273 if self.feature_names != data.feature_names:
1274 dat_missing = set(self.feature_names) - set(data.feature_names)
1275 my_missing = set(data.feature_names) - set(self.feature_names)
/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in __getattr__(self, name)
3612 if name in self._info_axis:
3613 return self[name]
-> 3614 return object.__getattribute__(self, name)
3615
3616 def __setattr__(self, name, value):
AttributeError: 'DataFrame' object has no attribute 'feature_names'
The problem has been solved. I had not converted X_train.iloc[val_idx] to xgb.DMatrix before calling predict. After converting X_train.iloc[val_idx] and X_test to xgb.DMatrix, the problem was gone!
Updated the following two lines:
oof[val_idx] = clf.predict(xgb.DMatrix(X_train.iloc[val_idx]), ntree_limit=clf.best_ntree_limit)
predictions += clf.predict(xgb.DMatrix(X_test), ntree_limit=clf.best_ntree_limit)/folds.n_splits
