'str' object has no attribute 'loc' when using dataframe string name - python

I have dataframes named df_1, df_2, df_3, ... df_10. I am writing a loop from 1 to 10, and each iteration should refer to a different dataframe whose name is df_ plus the loop value:
name=['1','2','3','4','5','6','7','8','9','10']
for na in name:
    data=f'df_{na}'.iloc[:,0]
With the code above, I get AttributeError: 'str' object has no attribute 'loc'.
So I need to turn the string into a reference to the dataframe with that name.
How can I do that?

Based on our chat, you're trying to make 100 copies of a single dataframe. Since creating variables dynamically from strings is bad practice, use a dict instead:
names = [f"df_{i}" for i in range(1, 101)]
dataframes = {name: df.copy() for name in names}  # df is the existing dataframe
for key, dataframe in dataframes.items():
    temp_data = dataframe.iloc[:, 0]
    do_something(temp_data)
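To connect this back to the loop in the question, here is a minimal sketch of looking a dataframe up by a constructed string key. The three small frames and the first_columns dict are hypothetical stand-ins, not the asker's data.

```python
import pandas as pd

# Hypothetical stand-ins for df_1 ... df_3; the real frames would come from the asker's data.
dataframes = {
    f"df_{i}": pd.DataFrame({"x": [i, i + 1], "y": [0, 0]})
    for i in range(1, 4)
}

first_columns = {}
for na in ["1", "2", "3"]:
    # The string is used as a dictionary key, not as a variable name.
    frame = dataframes[f"df_{na}"]
    first_columns[na] = frame.iloc[:, 0]
```

Because the name is only ever a key, nothing has to be converted from a string into a variable.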

Related

python - iterate list of column names and access to column

I have a DataFrame called df. I want to iterate over the columns_to_encode list and get each column via df.column, but I'm getting the following error (as expected). Any idea how I could do it?
columns_to_encode = ['column1','column2','column3']
for column in columns_to_encode:
    df.column
AttributeError: 'DataFrame' object has no attribute 'column'
Attribute access looks for a column literally named column. Use bracket indexing so that the value of the loop variable is used instead:
columns_to_encode = ['column1','column2','column3']
for column in columns_to_encode:
    df[column]
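A small sketch of the difference, assuming a made-up three-column frame (the data and the totals dict are illustrative only). It also shows getattr, which performs attribute access with a name held in a variable, and selecting all listed columns at once.

```python
import pandas as pd

# Illustrative data; the column names match the question, the values are invented.
df = pd.DataFrame({"column1": [1, 2], "column2": [3, 4], "column3": [5, 6]})
columns_to_encode = ["column1", "column2", "column3"]

totals = {}
for column in columns_to_encode:
    series = df[column]         # bracket indexing uses the variable's value
    same = getattr(df, column)  # attribute access with a name stored in a variable
    assert series.equals(same)
    totals[column] = int(series.sum())

# Passing the whole list selects all three columns as a DataFrame.
subset = df[columns_to_encode]
```

Bracket indexing is usually the safer choice, since it also works for column names that are not valid Python identifiers.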

DataFrame object has no attribute 'name'

I currently have a list of Pandas DataFrames. I'm trying to perform an operation on each list element (i.e. each DataFrame contained in the list) and then save that DataFrame to a CSV file.
I assigned a name attribute to each DataFrame, but I realized that in some cases the program throws an error AttributeError: 'DataFrame' object has no attribute 'name'.
Here's the code that I have.
# raw_og contains the file names for each CSV file.
# df_og is the list containing the DataFrame of each file.
for idx, file in enumerate(raw_og):
    df_og.append(pd.read_csv(os.path.join(data_og_dir, 'raw', file)))
    df_og[idx].name = file
# I'm basically checking if the DataFrame is in reverse-chronological order using the
# check_reverse function. If it is then I simply reverse the order and save the file.
for df in df_og:
    if (check_reverse(df)):
        df = df[::-1]
        df.to_csv(os.path.join(data_og_dir, 'raw_new', df.name), index=False)
    else:
        continue
The program is throwing an error in the second for loop where I used df.name.
This is especially strange because when I run print(df.name) it prints out the file name. Would anybody happen to know what I'm doing wrong?
Thank you.
The solution is to assign the values in place with iloc, rather than creating a copy.
Creating a copy of df loses the name:
df = df[::-1]  # creates a copy
Assigning the values in place keeps the original object intact, along with its name:
df.iloc[:] = df.iloc[:, ::-1]  # reversal maintaining the original object
Example code that reverses values along the column axis:
df = pd.DataFrame([[6,10]], columns=['a','b'])
df.name='t'
print(df.name)
print(df)
df.iloc[:] = df.iloc[:,::-1]
print(df)
print(df.name)
outputs:
t
   a   b
0  6  10
    a  b
0  10  6
t
A workaround is to set columns.name and read it back when needed.
Example:
df = pd.DataFrame()
df.columns.name = 'name'
print(df.columns.name)
name
I suspect it's the reversal that loses the custom .name attribute.
In [11]: df = pd.DataFrame()
In [12]: df.name = 'empty'
In [13]: df.name
Out[13]: 'empty'
In [14]: df[::-1].name
AttributeError: 'DataFrame' object has no attribute 'name'
You'll be better off storing a dict of dataframes rather than using .name:
df_og = {fn: pd.read_csv(os.path.join(data_og_dir, 'raw', fn)) for fn in raw_og}
Then you could iterate through this and reverse the values that need reversing...
for fn, df in df_og.items():
    if (check_reverse(df)):
        df = df[::-1]
        df.to_csv(os.path.join(data_og_dir, 'raw_new', fn), index=False)
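The following is a self-contained sketch of both points: a custom .name attribute does not carry over to the sliced frame, while a dict key stays available. The check_reverse body, the date column, and the file names are assumptions made for illustration, since the original function and data are not shown.

```python
import pandas as pd

def check_reverse(frame):
    # Hypothetical stand-in: treat a frame as reversed when its 'date' column is descending.
    return frame["date"].is_monotonic_decreasing

# Invented frames keyed by invented file names.
df_og = {
    "a.csv": pd.DataFrame({"date": [3, 2, 1], "value": [30, 20, 10]}),
    "b.csv": pd.DataFrame({"date": [1, 2, 3], "value": [10, 20, 30]}),
}

tagged = df_og["a.csv"].copy()
tagged.name = "a.csv"  # a plain Python attribute, not pandas metadata
name_survives = hasattr(tagged[::-1], "name")  # the sliced frame is a new object

fixed = {}
for fn, frame in df_og.items():
    if check_reverse(frame):
        # The key fn is still available here, so nothing depends on frame.name.
        fixed[fn] = frame[::-1].reset_index(drop=True)
```

In the real script, the line that fills fixed would be replaced by the to_csv call using fn as the output file name.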

Unpivot dataframe in Python - 'builtin_function_or_method' object has no attribute 'insert'

I unpivoted a dataframe:
Like this:
full_unpivot = full.unstack().reset_index(name='Value')
full_unpivot.rename(columns={'level_0': 'Attribute', 'level_1': 'Scenario'}, inplace=True)
Now I want to drop the decimals in the values and add a column filled with 1 or -1 depending on the sign of the 'Value' column.
However when I try to do:
full_unpivot = full_unpivot.applymap(np.int64)
or
list='Value'
full_unpivot[list] = full_unpivot[list].astype(int)
or
full_unpivot = full_unpivot.insert(4,'sign',1)
I get an error:
'builtin_function_or_method' object has no attribute 'insert'
Does anyone know what the problem could be?
Thanks in advance!
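I believe you need numpy.sign:
full_unpivot['sign'] = np.sign(full_unpivot['Value'])
The problem in your code is most likely the variable named list, which shadows the Python builtin of the same name.
To restore the builtin, reassign it (or remove the variable with del list):
import builtins
list = builtins.list
If you want to use insert to add sign as the second column, filled with the values of np.sign, note that insert modifies the DataFrame in place and returns None, so do not assign its result:
full_unpivot.insert(1,'sign',np.sign(full_unpivot['Value']))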
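Put together, the steps might look like the sketch below. The contents of full are invented for illustration; only the column names Attribute, Scenario, Value and sign come from the thread.

```python
import numpy as np
import pandas as pd

# Invented wide table: columns are attributes, index labels are scenarios.
full = pd.DataFrame({"A": [1.5, -2.5], "B": [3.0, -4.0]}, index=["s1", "s2"])

# unstack() must be called; reset_index then names the value column.
full_unpivot = full.unstack().reset_index(name="Value")
full_unpivot = full_unpivot.rename(
    columns={"level_0": "Attribute", "level_1": "Scenario"}
)

# Drop the decimals (astype(int) truncates toward zero).
value_column = "Value"  # avoid naming this variable 'list'
full_unpivot[value_column] = full_unpivot[value_column].astype(int)

# insert works in place and returns None, so its result is not assigned.
full_unpivot.insert(1, "sign", np.sign(full_unpivot["Value"]))
```

Note that np.sign yields 0 for a value of 0, so a strict 1 or -1 column would need an extra rule for zeros.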

Python pd using a variable with column name in groupby dot notation

I am trying to use a list that holds the column names for my groupby notation. My end goal is to loop through multiple columns and run the calculation without having to re-write the same line multiple times. Is this possible?
a_list = list(['','BTC_','ETH_'])
a_variable = ('{}ClosePrice'.format(a_list[0]))
proccessing_data['RSI'] = proccessing_data.groupby('Symbol').a_variable.transform(lambda x: talib.RSI(x, timeperiod=14))
This is the error I currently get, because pandas thinks I want a column literally named 'a_variable', which doesn't exist:
AttributeError: 'DataFrameGroupBy' object has no attribute 'a_variable'
Apparently bracket notation works, since the column name is then passed as a string value:
proccessing_data['RSI'] = proccessing_data.groupby('Symbol')[('{}ClosePrice'.format(a_list[0]))].transform(lambda x: talib.RSI(x, timeperiod=14))
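The same bracket notation extends to a loop over several columns. The sketch below assumes an invented price table and swaps talib.RSI for a simple per-group demeaning, purely so the example runs without talib; the output column names are also hypothetical.

```python
import pandas as pd

# Invented data; only the column naming pattern follows the question.
processing_data = pd.DataFrame({
    "Symbol": ["X", "X", "Y", "Y"],
    "ClosePrice": [1.0, 2.0, 3.0, 4.0],
    "BTC_ClosePrice": [10.0, 20.0, 30.0, 40.0],
    "ETH_ClosePrice": [5.0, 7.0, 9.0, 11.0],
})

prefixes = ["", "BTC_", "ETH_"]
for prefix in prefixes:
    column = f"{prefix}ClosePrice"
    # Stand-in for talib.RSI: subtract the group mean from each value.
    processing_data[f"{prefix}Demeaned"] = (
        processing_data.groupby("Symbol")[column].transform(lambda x: x - x.mean())
    )
```

Replacing the lambda with the talib.RSI call from the question should give one RSI column per prefix.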

loop over names of several pandas DataFrames

I have a couple of DataFrames from different files, which are named for example df001, df002 and so on.
Now I want to loop over those DataFrames to execute similar tasks. But I can't figure out how to address them.
This failed (AttributeError: 'str' object has no attribute 'iloc'):
names = ['df001', 'df002']
for name in names:
    name.iloc[1,1]
Can you try this? Put the DataFrame objects themselves in the list, not their names as strings:
names = [df001, df002]
for name in names:
    name.iloc[1,1]
If you use the string name for purposes other than looping, you can always store the dataframes in a dictionary:
d = {'df001': df001, 'df002': df002}
for name in d:
    d[name].iloc[1, 1]
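If the frames come from files anyway, the dictionary can be built at load time so that separate df001, df002 variables are never needed. This is a sketch under assumptions: the io.StringIO buffers stand in for the real files, and the names sources, frames and picked are made up.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for the source files.
sources = {
    "df001": io.StringIO("a,b\n1,2\n3,4\n"),
    "df002": io.StringIO("a,b\n5,6\n7,8\n"),
}

# Load every source straight into a dict keyed by name.
frames = {name: pd.read_csv(buffer) for name, buffer in sources.items()}

# Run the same task on each frame while keeping track of its name.
picked = {name: frame.iloc[1, 1] for name, frame in frames.items()}
```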
