'DataFrame' object has no attribute 'str'

I'm trying to loop over the columns of df, find any cell that contains a 0 (e.g. 'Users 0'), and replace that cell with an empty value.
I tried running this:
for col in df.columns:
    df.loc[sa[col].str.contains('0'), col] = ''
But it gives me: 'DataFrame' object has no attribute 'str'

This could be because your dataframe has multiple columns with the same name. I can recreate this error by doing the following:
import pandas as pd
df = pd.DataFrame([['0','1','2'],['2','4','5']],columns = ['a','b','b'])
for c in df.columns:
    print(df[c].str.replace("1",""))
The problem is that once you get to the repeated column name (in my example, when c == 'b'), df[c] is actually a DataFrame with two columns, and .str is not available on a DataFrame.
So if this is your issue, find the columns with the same name and give them unique names.
Also, as mentioned by @JonClements, it isn't necessary to loop over the columns at all; you can just do df = df.replace('.*0.*', '', regex=True)
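A minimal sketch of the "find the repeated names and make them unique" step, assuming the same toy frame as above. The helper names (seen, new_names) and the "_1" suffix scheme are illustrative choices, not something the original answer prescribes.

```python
import pandas as pd

# Toy frame with a repeated column name, as in the answer above.
df = pd.DataFrame([["0", "1", "2"], ["2", "4", "5"]], columns=["a", "b", "b"])

# Selecting a repeated name returns a DataFrame, which has no .str accessor.
print(type(df["b"]).__name__)

# List the names that occur more than once.
dupes = df.columns[df.columns.duplicated()].unique().tolist()
print(dupes)

# Give every column a unique name by appending a running count to repeats
# (hypothetical naming scheme: b, b_1, b_2, ...).
seen = {}
new_names = []
for name in df.columns:
    count = seen.get(name, 0)
    new_names.append(name if count == 0 else f"{name}_{count}")
    seen[name] = count + 1
df.columns = new_names

# With unique names, each df[c] is a Series and .str works again.
for c in df.columns:
    df[c] = df[c].str.replace("1", "", regex=False)
```

After renaming, the original loop from the question should run without the AttributeError, provided every column holds strings.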

Related

how to extract data from a cell with df into a new column with dict format pandas

I load a CSV into a df:
import pandas as pd
df = pd.read_csv('loves_1.csv')
In the FuelPrices column, each cell holds the data for another df:
df1 = pd.DataFrame(df['FuelPrices'][0])
df1
So, how do I extract the values of LastPriceChangeDateTime and CashPrice as a key:value pair into a new column of the main df, for DIESEL only (df['diesel_price_change'])?
Eventually, I want to append a dict with LastPriceChangeDateTime: CashPrice to that column every time the price changes.
I tried to loop with a bunch of parameters, but something seems to be messed up:
for index, row in df.iterrows():
    dfnew = pd.DataFrame(df['FuelPrices'][index])
    dfnew['price_change'] = dfnew.apply(lambda row: {row['LastPriceChangeDateTime']: row['CashPrice']}, axis=1)
    df['diesel_price_change'][index] = dfnew.apply(lambda x: y['price_change'] for y in x if y['ProductName'] == 'DIESEL')
I receive "'int' object is not iterable".

Unfortunately, the only way I found is to loop through it, but I still hope to find a pandas solution for it.
for index, row in df.iterrows():
    for product in df['FuelPrices'][index]:
        if product['ProductName'] == 'DIESEL':
            df['diesel_price_change'][index] = {product['LastPriceChangeDateTime']: product['CashPrice']}

Can you try this:
df['test_v1']=df['FuelPrices'].apply(lambda x: {x[0]['LastPriceChangeDateTime']:x[0]['CashPrice']})
If you are getting TypeError: string indices must be integers, use:
import ast
df['FuelPrices']=df['FuelPrices'].apply(ast.literal_eval)
df['test_v1']=df['FuelPrices'].apply(lambda x: {x[0]['LastPriceChangeDateTime']:x[0]['CashPrice']})
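Note that x[0] picks the first entry of each cell, whatever product it is. A hedged sketch of a variant that filters on ProductName instead is shown below. The sample rows and the helper name diesel_price_change are made up for illustration; the real loves_1.csv is assumed to hold a list of dicts with these three keys in each FuelPrices cell (after ast.literal_eval, if the cells are read as strings).

```python
import pandas as pd

# Hypothetical stand-in for the FuelPrices column: one list of dicts per row.
df = pd.DataFrame({
    "FuelPrices": [
        [
            {"ProductName": "UNLEADED", "LastPriceChangeDateTime": "2021-01-01T08:00", "CashPrice": 2.10},
            {"ProductName": "DIESEL", "LastPriceChangeDateTime": "2021-01-02T09:30", "CashPrice": 2.55},
        ],
        [
            {"ProductName": "UNLEADED", "LastPriceChangeDateTime": "2021-01-03T10:00", "CashPrice": 2.20},
        ],
    ]
})


def diesel_price_change(products):
    """Return {timestamp: price} for the first DIESEL entry, or None if absent."""
    for product in products:
        if product["ProductName"] == "DIESEL":
            return {product["LastPriceChangeDateTime"]: product["CashPrice"]}
    return None


# apply calls the helper once per cell; no iterrows or chained assignment needed.
df["diesel_price_change"] = df["FuelPrices"].apply(diesel_price_change)
print(df["diesel_price_change"].tolist())
```

Rows without a DIESEL entry end up with a missing value rather than raising, which may or may not be what you want.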

Indexing by row name

Can someone please help me with this. I want to call rows by name, so I used set_index on the 1st column in the dataframe to index the rows by name instead of using integers for indexing.
# Set 'Name' column as index on a Dataframe
df1 = df1.set_index("Name", inplace = True)
df1
Output:
AttributeError: 'NoneType' object has no attribute 'set_index'
Then I run the following code:
result = df1.loc["ABC4"]
result
Output:
AttributeError: 'NoneType' object has no attribute 'loc'
I don't usually run a second piece of code that depends on the first before fixing the error, but I originally ran them together in one Jupyter notebook cell. Now I see that both code cells have problems.
Please let me know where I went wrong. Thank you!

Maybe you should define your dataframe?
import pandas as pd
df1 = pd.DataFrame("here's your dataframe")
df1 = df1.set_index("Name")  # set_index returns a new DataFrame, so assign it back
or just
import pandas as pd
df1 = pd.DataFrame("here's your dataframe").set_index("Name")
df1

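Your variable "df1" is None by the time this line runs.
set_index(..., inplace=True) modifies the DataFrame in place and returns None, so assigning the result back to df1 replaces your DataFrame with None. Running the cell a second time then fails with 'NoneType' object has no attribute 'set_index'.
Re-create df1, then use either one of these, but not both together:
# Set 'Name' column as index on a Dataframe
df1 = df1.set_index("Name")
# or
df1.set_index("Name", inplace = True)
The rest of the code should work afterwards.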
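A small sketch that reproduces the NoneType problem from the question and shows one working pattern. The frame contents are invented; only the "Name" column and the "ABC4" label come from the question.

```python
import pandas as pd

# Hypothetical data; only the "Name" column and "ABC4" label are from the question.
df1 = pd.DataFrame({"Name": ["ABC1", "ABC4"], "Value": [10, 40]})

# With inplace=True the frame is modified in place and the call returns None,
# so `df1 = df1.set_index("Name", inplace=True)` would rebind df1 to None.
result = df1.copy().set_index("Name", inplace=True)
print(result)

# Working pattern: no inplace, assign the returned DataFrame.
df1 = df1.set_index("Name")
row = df1.loc["ABC4"]
print(row)
```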

python - iterate list of column names and access to column

I have a DataFrame called df and want to iterate over the columns_to_encode list and get the corresponding column of df for each name, but I'm getting the following error (as expected). Any idea how I could do it?
columns_to_encode = ['column1','column2','column3']
for column in columns_to_encode:
    df.column
AttributeError: 'DataFrame' object has no attribute 'column'

Try this code. df.column looks for a column literally named "column", whereas df[column] uses the value of the loop variable:
columns_to_encode = ['column1','column2','column3']
for column in columns_to_encode:
    df[column]
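For completeness, a runnable sketch of bracket access inside the loop, assuming a small made-up frame with the three column names from the question. The totals dict is only there to show that each df[column] is a usable Series.

```python
import pandas as pd

# Hypothetical frame using the column names from the question.
df = pd.DataFrame({"column1": [1, 2], "column2": [3, 4], "column3": [5, 6]})
columns_to_encode = ["column1", "column2", "column3"]

# Bracket access looks the column up by the string held in `column`.
totals = {}
for column in columns_to_encode:
    totals[column] = int(df[column].sum())

# If the loop body treats every column the same way, a list of names
# selects them all at once as a DataFrame.
subset = df[columns_to_encode]
print(totals)
```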

Change DataTypes of Pandas Columns by selecting columns by regex

I have a Pandas dataframe with a lot of columns looking like p_d_d_c0, p_d_d_c1, ... p_d_d_g1, p_d_d_g2, ....
df =
a b c p_d_d_c0 p_d_d_c1 p_d_d_c2 ... p_d_d_g0 p_d_d_g1 ...
All the columns that conform to the regex need to be selected and their datatypes changed from object to float. In particular, the columns that look like p_d_d_c* and p_d_d_g* are all object types, and I would like to change them to float types. Is there a way to select columns in bulk using a regular expression and change them to float types?
I tried the answer from here, but it takes a lot of time and memory as I have hundreds of these columns.
df[df.filter(regex=("p_d_d_.*"))
I also tried:
df.select(lambda col: col.startswith('p_d_d_g'), axis=1)
But, it gives an error:
AttributeError: 'DataFrame' object has no attribute 'select'
My Pandas version is 1.0.1
So, how to select columns in bulk and change their data types using regex?

Try this:
import pandas as pd
# sample dataframe
df = pd.DataFrame(data={"co1":[1,2,3,4], "co22":[4,3,2,1], "co3":[2,3,2,4], "abc":[5,4,3,2]})
# select all columns which have co in it
floatcols = [col for col in df.columns if "co" in col]
for floatcol in floatcols:
    df[floatcol] = df[floatcol].astype(float)

From the same link, and with some astype magic.
column_vals = df.columns.map(lambda x: x.startswith("p_d_d_"))
train_temp = df.loc(axis=1)[column_vals]
train_temp = train_temp.astype(float)
EDIT:
To modify the original dataframe, do something like this:
column_vals = [x for x in df.columns if x.startswith("p_d_d_")]
df[column_vals] = df[column_vals].astype(float)
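Since the question asks specifically for a regular expression, here is a hedged sketch that uses DataFrame.filter(regex=...) only to collect the matching column names and then converts them in one assignment. The sample values and the exact pattern are assumptions based on the column names given in the question.

```python
import pandas as pd

# Hypothetical frame: numeric values stored as strings in the p_d_d_* columns.
df = pd.DataFrame({
    "a": ["x", "y"],
    "p_d_d_c0": ["1.5", "2"],
    "p_d_d_g1": ["3", "4.25"],
})

# filter(regex=...) matches column names with re.search; keep only the names.
target_cols = df.filter(regex=r"^p_d_d_[cg]\d+$").columns

# Convert all matching columns at once and write them back.
df[target_cols] = df[target_cols].astype(float)
print(df.dtypes)
```

astype(float) raises if any cell cannot be parsed; pd.to_numeric with errors="coerce" applied per column is a common alternative when the data is dirty.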

DataFrame object has no attribute 'name'

I currently have a list of Pandas DataFrames. I'm trying to perform an operation on each list element (i.e. each DataFrame contained in the list) and then save that DataFrame to a CSV file.
I assigned a name attribute to each DataFrame, but I realized that in some cases the program throws an error AttributeError: 'DataFrame' object has no attribute 'name'.
Here's the code that I have.
# raw_og contains the file names for each CSV file.
# df_og is the list containing the DataFrame of each file.
for idx, file in enumerate(raw_og):
    df_og.append(pd.read_csv(os.path.join(data_og_dir, 'raw', file)))
    df_og[idx].name = file
# I'm basically checking if the DataFrame is in reverse-chronological order using the
# check_reverse function. If it is then I simply reverse the order and save the file.
for df in df_og:
    if (check_reverse(df)):
        df = df[::-1]
        df.to_csv(os.path.join(data_og_dir, 'raw_new', df.name), index=False)
    else:
        continue
The program is throwing an error in the second for loop where I used df.name.
This is especially strange because when I run print(df.name) it prints out the file name. Would anybody happen to know what I'm doing wrong?
Thank you.

The solution is to set the values with iloc, rather than creating a copy.
Creating a copy of df loses the name:
df = df[::-1] # creates a copy
Setting the values keeps the original object intact, along with its name:
df.iloc[:] = df.iloc[:, ::-1] # reversal maintaining the original object
Example code that reverses values along the column axis:
df = pd.DataFrame([[6,10]], columns=['a','b'])
df.name='t'
print(df.name)
print(df)
df.iloc[:] = df.iloc[:,::-1]
print(df)
print(df.name)
outputs:
t
   a   b
0  6  10
    a  b
0  10  6
t

A workaround is to set a columns.name and use it when needed.
Example:
df = pd.DataFrame()
df.columns.name = 'name'
print(df.columns.name)
name

I suspect it's the reversal that loses the custom .name attribute.
In [11]: df = pd.DataFrame()
In [12]: df.name = 'empty'
In [13]: df.name
Out[13]: 'empty'
In [14]: df[::-1].name
AttributeError: 'DataFrame' object has no attribute 'name'
You'll be better off storing a dict of dataframes rather than using .name:
df_og = {fn: pd.read_csv(os.path.join(data_og_dir, 'raw', fn)) for fn in raw_og}
Then you could iterate through this and reverse the values that need reversing...
for fn, df in df_og.items():
    if (check_reverse(df)):
        df = df[::-1]
        df.to_csv(os.path.join(data_og_dir, 'raw_new', fn), index=False)
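A self-contained sketch of the dict-of-DataFrames approach, using in-memory frames instead of files so it can be run as-is. The file names, the sample data, and this version of check_reverse are all hypothetical stand-ins for the asker's own.

```python
import pandas as pd

# Hypothetical stand-ins for the CSV files, keyed by file name.
frames = {
    "newest_first.csv": pd.DataFrame(
        {"date": ["2020-01-03", "2020-01-02", "2020-01-01"], "v": [3, 2, 1]}
    ),
    "oldest_first.csv": pd.DataFrame(
        {"date": ["2020-01-01", "2020-01-02", "2020-01-03"], "v": [1, 2, 3]}
    ),
}


def check_reverse(df):
    """Hypothetical check: True when the dates run newest to oldest."""
    return pd.to_datetime(df["date"]).is_monotonic_decreasing


# The key carries the file name, so nothing depends on a custom .name
# attribute surviving the slice.
for fn, df in frames.items():
    if check_reverse(df):
        frames[fn] = df[::-1].reset_index(drop=True)

print(frames["newest_first.csv"])
```

In the real script the reversed frame would be written out with to_csv using fn as the file name, as in the answer above.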
