python - Iterate over a list of column names and access each column

I have a DataFrame called df. I want to iterate over the columns_to_encode list and get the value of each column with df.column, but I'm getting the following error (as expected). Any idea how I could do it?
columns_to_encode = ['column1','column2','column3']
for column in columns_to_encode:
    df.column
AttributeError: 'DataFrame' object has no attribute 'column'

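df.column looks for an attribute literally named "column"; it does not use the value of the loop variable. Use bracket indexing, which accepts the column name as a string:
columns_to_encode = ['column1','column2','column3']
for column in columns_to_encode:
    df[column]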
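To make the difference concrete, here is a minimal sketch. The data values are hypothetical; only the column names come from the question. It collects each column by bracket indexing and shows that getattr is an alternative dynamic lookup, with the caveat noted in the comments.

```python
import pandas as pd

# Hypothetical example data; the column names mirror the question.
df = pd.DataFrame({
    "column1": [1, 2],
    "column2": [3, 4],
    "column3": [5, 6],
})
columns_to_encode = ["column1", "column2", "column3"]

selected = {}
for column in columns_to_encode:
    # Bracket indexing uses the string stored in `column`.
    selected[column] = df[column]

# getattr performs the same dynamic lookup, but only works for names that
# are valid Python identifiers and do not clash with DataFrame attributes.
same_series = getattr(df, "column1")
```

Bracket indexing is usually the safer choice because it also works for names with spaces or names such as "count" that collide with DataFrame methods.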

Related

Indexing by row name

Can someone please help me with this? I want to select rows by name, so I used set_index on the first column of the DataFrame to index the rows by name instead of by integer position.
# Set 'Name' column as index on a Dataframe
df1 = df1.set_index("Name", inplace = True)
df1
Output:
AttributeError: 'NoneType' object has no attribute 'set_index'
Then I ran the following code:
result = df1.loc["ABC4"]
result
Output:
AttributeError: 'NoneType' object has no attribute 'loc'
I don't usually run code that depends on an earlier step before fixing that step's error, but I originally ran both snippets together in one Jupyter notebook cell. Now I see that both of them fail.
Please let me know where I went wrong. Thank you!
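Make sure the DataFrame is defined first, and assign the result of set_index back to it:
import pandas as pd
df1 = pd.DataFrame({"Name": ["ABC4"], "Value": [1]})  # placeholder data; use your own
df1 = df1.set_index("Name")
or in one step:
import pandas as pd
df1 = pd.DataFrame({"Name": ["ABC4"], "Value": [1]}).set_index("Name")  # placeholder data
df1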
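The first error means df1 was already None when the cell ran: either it was never assigned a DataFrame, or an earlier run of this line overwrote it.
The cause is this line:
# Set 'Name' column as index on a Dataframe
df1 = df1.set_index("Name", inplace = True)
With inplace=True, set_index modifies df1 and returns None, so the assignment replaces df1 with None. Use the assignment or inplace=True, not both:
df1 = df1.set_index("Name")
Recreate df1 from your data first, since it is currently None. The rest of the code should work afterwards.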
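The behaviour behind both errors can be reproduced in a few lines. This is a small sketch with hypothetical data; only the "Name" column and the "ABC4" label are taken from the question.

```python
import pandas as pd

# Hypothetical data; only "Name" and "ABC4" come from the question.
df1 = pd.DataFrame({"Name": ["ABC4", "XYZ9"], "Score": [10, 20]})

# With inplace=True the method modifies the frame it is called on and
# returns None, so assigning the result would overwrite the variable.
returned = df1.copy().set_index("Name", inplace=True)

# Without inplace, a new indexed frame is returned and can be assigned.
indexed = df1.set_index("Name")
result = indexed.loc["ABC4"]
```

Here returned is None, while indexed can be queried by row label with .loc.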

index string has no method of isin()

I have a DataFrame whose index holds string names such as 'apple'.
I also have a list:
name_list=['apple','orange','tomato']
I'd like to keep only the rows whose index value is in that list:
df=df.loc[df.index.str.isin(name_list)]
but I get this error:
AttributeError: 'StringMethods' object has no attribute 'isin'
Use df.index.isin, not df.index.str.isin:
df = df.loc[df.index.isin(name_list)]
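You can also use reindex. Note that, unlike isin, it returns the rows in the order of name_list and adds a NaN row for any name that is missing from the index: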
df = df.reindex(name_list)
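The two answers are not interchangeable when a name is absent from the index. The sketch below uses hypothetical data in which "tomato" is deliberately missing, to show how the results differ.

```python
import pandas as pd

# Hypothetical data: "tomato" is deliberately missing from the index.
df = pd.DataFrame(
    {"price": [1.0, 2.0, 3.0]},
    index=["orange", "apple", "banana"],
)
name_list = ["apple", "orange", "tomato"]

# isin keeps only rows that exist, in the frame's original order.
filtered = df.loc[df.index.isin(name_list)]

# reindex follows the order of name_list and adds a NaN row for "tomato".
reindexed = df.reindex(name_list)
```

If missing names should be silently skipped, isin is probably the better fit; if the output must follow the list order, reindex may be preferable.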

'str' object has no attribute 'loc' when using dataframe string name

I have DataFrames named df_1, df_2, df_3, ... df_10. I am looping from 1 to 10, and each iteration should refer to a different DataFrame whose name is df_ plus the loop value:
name=['1','2','3','4','5','6','7','8','9','10']
for na in name:
    data=f'df_{na}'.iloc[:,0]
When I do this, I get the error AttributeError: 'str' object has no attribute 'loc'
so I need to turn the string into the DataFrame with that name.
How can I do it?
Based on our chat, you're trying to make 100 copies of a single dataframe. Since dynamically named variables are hard to manage, use a dict instead:
names = [f"df_{i}" for i in range(1, 101)]
dataframes = {name: df.copy() for name in names}  # df is the existing dataframe
for key, dataframe in dataframes.items():
    temp_data = dataframe.iloc[:, 0]  # first column
    do_something(temp_data)  # placeholder for your own function
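The same dict approach also covers the case in the question, where the frames are distinct rather than copies. The following is only a sketch: the frame contents are hypothetical stand-ins for df_1 through df_10, built in the dict directly instead of as separate variables.

```python
import pandas as pd

# Hypothetical stand-ins for df_1 ... df_10, stored in a dict instead of
# separate variables.
dataframes = {
    f"df_{i}": pd.DataFrame({"a": [i, i + 1], "b": [0, 0]})
    for i in range(1, 11)
}

first_columns = {}
for na in range(1, 11):
    key = f"df_{na}"
    # Look the frame up by its string key, then take the first column.
    first_columns[key] = dataframes[key].iloc[:, 0]
```

The string built in the loop is now a dictionary key rather than a variable name, so no string-to-variable conversion is needed.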

'DataFrame' object has no attribute 'str'

I'm trying to loop over the columns to find cells containing a 0 (e.g. 'Users 0') in every column of the df and replace those cells with an empty value.
I tried running this:
for col in df.columns:
    df.loc[df[col].str.contains('0'), col] = ''
But it gives me 'DataFrame' object has no attribute 'str'
This could be because your dataframe has multiple columns with the same name. I can recreate this error by doing the following:
import pandas as pd
df = pd.DataFrame([['0','1','2'],['2','4','5']],columns = ['a','b','b'])
for c in df.columns:
    print(df[c].str.replace("1",""))
The problem is that once you reach the repeated column name (in my example, when c == 'b'), df[c] is a DataFrame with two columns rather than a Series, so .str is not available.
If this is your issue, find the columns that share a name and give them unique names.
Also, as mentioned by @JonClements, it isn't necessary to loop over the columns at all; you can just do df = df.replace('.*0.*', '', regex=True)
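One way to find and rename the repeated columns is sketched below, reusing the example frame from the answer. The suffix scheme (b, b_1, ...) is an assumption chosen for illustration, not something the answer prescribes.

```python
import pandas as pd

df = pd.DataFrame(
    [["0", "1", "2"], ["2", "4", "5"]],
    columns=["a", "b", "b"],
)

# Names that occur more than once (second and later occurrences).
duplicate_names = df.columns[df.columns.duplicated()].tolist()

# One possible renaming scheme (an assumption, not from the answer):
# append a running suffix to repeated names.
seen = {}
new_names = []
for name in df.columns:
    count = seen.get(name, 0)
    new_names.append(name if count == 0 else f"{name}_{count}")
    seen[name] = count + 1
df.columns = new_names

# Every column now selects a Series, so .str is available.
cleaned = {c: df[c].str.replace("1", "") for c in df.columns}
```

After the rename, the loop from the answer runs without the AttributeError because each df[c] is a single column.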

Feed list of column names to list then calculate by specific column in dataframe

I have a wide dataset with specific columns that I would like to multiply by another column holding a population weight, replacing the original values with the results. When I run the example code below, I get the error: AttributeError: 'function' object has no attribute 'list'.
Please advise on how I can make this work using the list. Thanks!
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,100,size=(15, 4)), columns=list('ABCD'))
df['WGT']=0.5
cols_to_calc=['A', 'C']
df.update(df.columns.isin.list(cols_to_calc)).mul(df['WGT'])
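Select the columns with the list, multiply them row-wise by the weight column, and assign the result back: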
df[cols_to_calc] = df[cols_to_calc].mul(df['WGT'], axis=0)
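A small deterministic check of that line, using hypothetical values in place of the random data from the question, might look like this:

```python
import pandas as pd

# Small deterministic frame (hypothetical values) in place of random data.
df = pd.DataFrame({"A": [10, 20], "B": [1, 2], "C": [30, 40], "D": [3, 4]})
df["WGT"] = 0.5
cols_to_calc = ["A", "C"]

# axis=0 aligns the weight Series with the row index, so each row is
# multiplied by its own weight.
df[cols_to_calc] = df[cols_to_calc].mul(df["WGT"], axis=0)
```

Columns A and C are halved while B and D are left untouched. Without axis=0, mul would try to align the weights with the column labels instead of the rows.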
