index string has no method of isin() - python

I have a DataFrame whose index consists of string names such as 'apple'.
Now I have a list:
name_list=['apple','orange','tomato']
I'd like to filter the DataFrame down to the rows whose index value is in the list above:
df=df.loc[df.index.str.isin(name_list)]
but I got this error:
AttributeError: 'StringMethods' object has no attribute 'isin'

Use df.index.isin, not df.index.str.isin:
df = df.loc[df.index.isin(name_list)]

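You can also use reindex. Note that it returns the rows in the order of name_list and adds an all-NaN row for any name that is missing from the index: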
df = df.reindex(name_list)
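To make the difference between the two answers concrete, here is a minimal sketch. The sample frame and its price column are made up for illustration; only name_list comes from the question.

```python
import pandas as pd

# Hypothetical sample data: 'orange' is deliberately missing from the index.
df = pd.DataFrame(
    {"price": [1.0, 2.0, 3.0]},
    index=["apple", "banana", "tomato"],
)
name_list = ["apple", "orange", "tomato"]

# Boolean mask: keeps only rows that exist, in their original order.
filtered = df.loc[df.index.isin(name_list)]

# reindex: one row per requested label, NaN where the label is missing.
reindexed = df.reindex(name_list)

print(filtered)
print(reindexed)
```

So isin is probably what you want for pure filtering, while reindex fits better when you need exactly one row per requested name.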

Related

python - iterate list of column names and access to column

I have a DataFrame called df. I want to iterate over the columns_to_encode list and get the value of df.column, but I'm getting the following error (as expected). Any idea how I could do it?
columns_to_encode = ['column1','column2','column3']
for column in columns_to_encode:
    df.column
AttributeError: 'DataFrame' object has no attribute 'column'
df.column looks for a column that is literally named 'column'. Use bracket indexing with the loop variable instead:
columns_to_encode = ['column1','column2','column3']
for column in columns_to_encode:
    df[column]
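A small self-contained sketch of the bracket lookup inside a loop. The sample values and the unique_counts dict are hypothetical, added only so the loop does something visible.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    {
        "column1": ["a", "b", "a"],
        "column2": ["x", "x", "y"],
        "column3": ["p", "q", "r"],
    }
)
columns_to_encode = ["column1", "column2", "column3"]

unique_counts = {}
for column in columns_to_encode:
    # df[column] uses the string held by the variable as the label.
    unique_counts[column] = df[column].nunique()
    # getattr(df, column) is the dynamic equivalent of df.column1 etc.
    assert getattr(df, column).equals(df[column])

print(unique_counts)
```

Bracket indexing also works for column names that are not valid Python identifiers, which attribute access cannot handle.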

Unpivot dataframe in Python - 'builtin_function_or_method' object has no attribute 'insert'

I unpivoted a dataframe like this:
full_unpivot = full.unstack().reset_index(name='Value')
full_unpivot.rename(columns={'level_0': 'Attribute', 'level_1': 'Scenario'}, inplace=True)
Now I want to drop the decimals in the values and add a column filled with 1 or -1 depending on the sign of the 'Value' column.
However when I try to do:
full_unpivot = full_unpivot.applymap(np.int64)
or
list='Value'
full_unpivot[list] = full_unpivot[list].astype(int)
or
full_unpivot = full_unpivot.insert(4,'sign',1)
I get an error:
'builtin_function_or_method' object has no attribute 'insert'
Does anyone know what the problem could be?
Thanks in advance!
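I believe you need numpy.sign:
full_unpivot['sign'] = np.sign(full_unpivot['Value'])
The problem in your code is probably the variable named list, which shadows the Python built-in.
Rename the variable, or restore the built-in:
import builtins
list = builtins.list
If you want to use insert to add a column called sign in the second position, filled with the values of np.sign, use the following. Note that insert modifies the DataFrame in place and returns None, so do not assign its result back:
full_unpivot.insert(1,'sign',np.sign(full_unpivot['Value']))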
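Here is a minimal sketch of the whole sequence on made-up data (the three rows are hypothetical; the column names follow the question). It also shows why assigning the result of insert back to the variable loses the frame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the unpivoted frame from the question.
full_unpivot = pd.DataFrame(
    {
        "Attribute": ["a", "b", "c"],
        "Scenario": ["s1", "s1", "s2"],
        "Value": [2.7, -1.2, 0.0],
    }
)

# Drop the decimals; astype(int) truncates toward zero.
full_unpivot["Value"] = full_unpivot["Value"].astype(int)

# insert works in place and returns None.
returned = full_unpivot.insert(1, "sign", np.sign(full_unpivot["Value"]))

print(returned)
print(full_unpivot)
```

Keep in mind that np.sign gives 0 for a value of 0, so the column is not strictly 1 or -1 if zeros can occur.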

pandas: drop multiple columns whose names are in a list and assign the result to a new dataframe

I have a dataframe with several columns:
df
pymnt_plan ... settlement_term days
Now I know which columns I want to delete/drop, based on the following list:
mylist = ['pymnt_plan',
'recoveries',
'collection_recovery_fee',
'policy_code',
'num_tl_120dpd_2m',
'hardship_flag',
'debt_settlement_flag_date',
'settlement_status',
'settlement_date',
'settlement_amount',
'settlement_percentage',
'settlement_term']
How can I drop multiple columns whose names are in a list and assign the result to a new dataframe? In this case:
df2
days
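You can do this (using mylist from the question, and avoiding list as a variable name because it shadows the built-in):
new_df = df[mylist]
df = df.drop(columns=mylist)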
In pandas 0.20.3, using 'df = df.drop(columns=mylist)' I get:
TypeError: drop() got an unexpected keyword argument 'columns'
The columns keyword was only added in pandas 0.21.0, so on older versions you can use this instead:
df = df.drop(axis=1, labels=mylist)
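A short sketch on a cut-down, made-up version of the frame (only three of the listed columns plus days, with invented values). It uses the labels/axis form, which should work on both old and recent pandas.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question.
df = pd.DataFrame(
    {
        "pymnt_plan": ["n", "n"],
        "recoveries": [0.0, 1.5],
        "settlement_term": [12, 24],
        "days": [30, 60],
    }
)
mylist = ["pymnt_plan", "recoveries", "settlement_term"]

# The columns being removed, in case you still need them.
dropped = df[mylist]

# drop returns a new frame; df itself is left unchanged.
df2 = df.drop(labels=mylist, axis=1)

print(df2)
```

Because drop returns a new object by default, assigning to df2 gives the new dataframe the question asks for without touching df.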

ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series

I'm using pandas 0.20.3 with Python 3.x. I want to add a column to a pandas DataFrame from another pandas DataFrame. Both DataFrames contain 51 rows, so I used the following code:
class_df['phone']=group['phone'].values
I got the following error message:
ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series
class_df.dtypes gives me:
Group_ID object
YEAR object
Terget object
phone object
age object
and type(group['phone']) returns pandas.core.series.Series
Can you suggest what I need to change to get rid of this error?
The first 5 rows of group['phone'] are given below:
0 [735015372, 72151508105, 7217511580, 721150431...
1 []
2 [735152771, 7351515043, 7115380870, 7115427...
3 [7111332015, 73140214, 737443075, 7110815115...
4 [718218718, 718221342, 73551401, 71811507...
Name: phoen, dtype: object
In most cases, this error occurs when you end up with an empty DataFrame. The approach that worked best for me was to check whether the DataFrame is empty before using apply():
if len(df) != 0:
    df['indicator'] = df.apply(assign_indicator, axis=1)
You have a column of ragged lists. Your only option is to assign a list of lists, and not an array of lists (which is what .values gives).
class_df['phone'] = group['phone'].tolist()
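A minimal sketch of the tolist() assignment with ragged lists. Both frames are hypothetical three-row stand-ins for the 51-row frames in the question, and the phone numbers are shortened samples.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question.
class_df = pd.DataFrame({"Group_ID": ["g1", "g2", "g3"]})
group = pd.DataFrame(
    {"phone": [[735015372, 7217511580], [], [718218718]]}
)

# A plain Python list of lists becomes one object-dtype column,
# with one list per row, regardless of the inner lengths.
class_df["phone"] = group["phone"].tolist()

print(class_df)
print(class_df["phone"].dtype)
```

This assumes both frames have the same number of rows; the assignment is positional, so the index of group is ignored.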
The error in the question title,
"ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series"
can also occur if, for whatever reason, the table does not have any rows.
Instead of using an if statement, you can set the result_type argument of apply() to "reduce":
df['new_column'] = df.apply(func, axis=1, result_type='reduce')
The data assigned to a DataFrame column must be a one-dimensional array. For example, consider a num_arr to be added to a DataFrame:
num_arr.shape
(1, 126)
For this num_arr to be added as a DataFrame column, it has to be reshaped:
num_arr = num_arr.reshape(-1)
num_arr.shape
(126,)
Now I can set this array as a DataFrame column:
df = pd.DataFrame()
df['numbers'] = num_arr
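For completeness, a runnable version of the same idea. The contents of num_arr are invented here (np.arange), since the original post only shows its shape.

```python
import numpy as np
import pandas as pd

# Hypothetical array with the shape from the answer: one row, 126 columns.
num_arr = np.arange(126).reshape(1, 126)

# reshape(-1) flattens to one dimension; -1 lets NumPy infer the length.
flat = num_arr.reshape(-1)

df = pd.DataFrame()
df["numbers"] = flat

print(num_arr.shape, flat.shape, df.shape)
```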

Indexing pandas series with parent dataframe index

I want to run a function over each row of a pandas DataFrame and store its result in a derived column, score. The function shown below is only an example, but it should be able to index by the parent DataFrame's column labels and access values like row['col1']. However, a Series object is passed to the function, which seems to lose the column label information.
For example:
def calculate(row):
    cols=row.columns
    loc=row['loc']
    h=row['h']
    isst=row['Ist']
    Hol=row['Hol']
    return loc+h+len(cols)
a['score']=a.apply(lambda row:calculate(row),axis=1)
gives:
AttributeError: ("'Series' object has no attribute 'columns'", u'occurred at index 0')
so how can I access a named series like a named tuple in the lambda function?
A quick hack would be to do:
a['score']=a.apply(lambda row:calculate(makedict(row,row.index)),axis=1)
where the makedict function creates a dictionary for each row so that it can be accessed in the function by column labels. But is there a pandas way?
I finally found the to_dict function, which helps with this:
def calculate(row):
    row=row.to_dict()
    loc=row['loc']
    h=row['h']
    isst=row['Ist']
    Hol=row['Hol']
    return loc+h+len(row.keys())
a['score']=a.apply(calculate,axis=1)
Why not:
a['score']=a.apply(lambda row:row['loc'] + row['h']+len(row.index),axis=1)
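A minimal sketch showing that the row Series already carries the column labels in its index, so no dictionary conversion is needed. The frame a below is hypothetical; only the column names come from the question.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
a = pd.DataFrame(
    {"loc": [1, 2], "h": [10, 20], "Ist": [0, 1], "Hol": [1, 0]}
)


def calculate(row):
    # With axis=1 each row is a Series whose index holds the column labels,
    # so row.index plays the role that .columns has on the DataFrame.
    return row["loc"] + row["h"] + len(row.index)


a["score"] = a.apply(calculate, axis=1)

print(a)
```

Each score is loc + h + 4 here, since the frame has four columns at the time apply runs.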
