I have a couple of DataFrames from different files, which are named, for example, df001, df002 and so on.
Now I want to loop over those DataFrames to execute similar tasks, but I can't figure out how to address them.
This failed (AttributeError: 'str' object has no attribute 'iloc'):
names = ['df001', 'df002']
for name in names:
    name.iloc[1, 1]
Try this instead: put the actual DataFrame objects in the list rather than their names as strings:
names = [df001, df002]
for name in names:
    name.iloc[1, 1]
If you need the string names for purposes other than looping, you can always store the dataframes in a dictionary:
d = {'df001': df001, 'df002': df002}
for name in d:
    d[name].iloc[1, 1]
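Since the DataFrames come from separate files, you can also build that dictionary while reading them, so the name-to-object mapping never has to exist as individual variables. A minimal sketch, assuming the data live in CSV files named df001.csv and df002.csv (the file names and the read_csv call are illustrative):
import pandas as pd

# hypothetical file names; adjust the paths/reader to your actual data
files = ['df001.csv', 'df002.csv']

# key each DataFrame by the file stem, e.g. 'df001'
dataframes = {f.rsplit('.', 1)[0]: pd.read_csv(f) for f in files}

for name, frame in dataframes.items():
    print(name, frame.iloc[1, 1])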
I have a script with if statements that can create up to 14 possible dataframes:
['result_14', 'result_13', 'result_12', 'result_11', 'result_10', 'result_9', 'result_8', 'result_7', 'result_6', 'result_5', 'result_4', 'result_3', 'result_2', 'result_1']
Not all of the dataframes are created every time I run the script; it depends on a secondary input variable. I am now attempting to concatenate the dataframes, but I run into an issue with those that do not exist.
pd.concat(([result_14, result_13, result_12, result_11, result_10, result_9, result_8, result_7, result_6, result_5, result_4, result_3, result_2, result_1]), ignore_index=True)
NameError: name 'result_13' is not defined
I have tried finding all dfs that exist in my python memory and parsing the results, but this creates a list of strings rather than a list of dataframes:
alldfs = [var for var in dir() if isinstance(eval(var), pd.core.frame.DataFrame)]
SelectDFs = [s for s in alldfs if "result" in s]
SelectDFs
['result_14', 'result_15', 'result_12', 'result_11', 'result_10', 'result_9', 'result_8', 'result_7', 'result_6', 'result_5', 'result_4', 'result_3', 'result_2', 'result_1']
pd.concat(([SelectDFs]), ignore_index=True)
TypeError: cannot concatenate object of type '<class 'list'>'; only Series and DataFrame objs are valid
You can try the IPython magic
%who_ls DataFrame
# or: %whos DataFrame
which lists the DataFrame variables that actually exist in your session. In your case:
l = %who_ls DataFrame
pd.concat([eval(dfn) for dfn in l if dfn.startswith('result')], ignore_index=True)
You are passing a list of strings, not DataFrame objects.
Once you are able to get the DataFrame objects themselves, you can pass SelectDFs without the extra brackets:
pd.concat(SelectDFs, ignore_index=True)
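A minimal sketch of that conversion step, assuming the result_* frames live at script level so they can be looked up in globals() (used here instead of eval as a slightly safer lookup); names that were never created are simply skipped:
import pandas as pd

# names that may or may not exist as variables
candidate_names = [f'result_{i}' for i in range(14, 0, -1)]

frames = []
for name in candidate_names:
    obj = globals().get(name)          # None if the variable was never created
    if isinstance(obj, pd.DataFrame):
        frames.append(obj)

combined = pd.concat(frames, ignore_index=True)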
Have you tried converting them into DataFrames? When you try to concat them, the error says your data need to be DataFrames rather than lists, so have you tried converting your lists into DataFrames first?
This link may help you:
Convert List to Pandas Dataframe Column
I want to create n DataFrames using the value s as the name of each DataFrame, but I could only create a list full of DataFrames. Is it possible to turn each element of this list into its own named DataFrame?
# estacao holds station names like [ABc, dfg, hil, ..., xyz]; each should become the name of a DataFrame
estacao = dados.Station.unique()
for s, i in zip(estacao, range(126)):
    estacao[i] = dados.groupby('Station').get_group(s)
I'd use a dictionary here. Then you can name the keys with s and the values can each be the dataframe corresponding to that group:
groups = dados.Station.unique()
groupby_ = dados.groupby('Station')
dataframes = {s: groupby_.get_group(s) for s in groups}
Then calling each one by name is as simple as:
group_df = dataframes['group_name']
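As a side note, the same mapping can be built in one line by iterating over the groupby object itself, since iterating a GroupBy yields (group_name, sub_frame) pairs; a sketch assuming the same dados DataFrame:
# iterating a GroupBy yields (group_name, sub_frame) pairs,
# so dict() turns it straight into {station: DataFrame}
dataframes = dict(tuple(dados.groupby('Station')))

group_df = dataframes['ABc']  # 'ABc' is a hypothetical station name from estacao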
If you REALLY NEED to create DataFrames named after s (which I named group in the following example), using exec is the solution.
groups = dados.Station.unique()
groupby_ = dados.groupby('Station')
for group in groups:
    exec(f"{group} = groupby_.get_group('{group:s}')")
CAVEAT
See this answer to understand why using exec and eval commands is not always desirable.
Why should exec() and eval() be avoided?
I have dataframes named df_1, df_2, df_3, ..., df_10. Now I am creating a loop from 1 to 10, where each iteration refers to a different dataframe whose name is 'df_' plus the loop index:
name = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
for na in name:
    data = f'df_{na}'.iloc[:, 0]
If I run the above, I get AttributeError: 'str' object has no attribute 'iloc',
so I need to convert the string into the dataframe it names.
How do I do that?
Based on our chat, you're trying to make 100 copies of a single dataframe. Since creating variable variables is bad practice, use a dict instead:
names = ["df_" + str(i) for i in range(1, 101)]
dataframes = {name: df.copy() for name in names}  # df is the existing dataframe

for key, dataframe in dataframes.items():
    temp_data = dataframe.iloc[:, 0]
    do_something(temp_data)
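Any individual copy can then be looked up by its string key, which is what the df_ + loop-index pattern was really after; for example:
first_copy = dataframes["df_1"]   # the frame that would have been the variable df_1
column = first_copy.iloc[:, 0]    # its first column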
I have a list of dataframes which I wish to convert to multiple csv.
Example:
List_Df = [df1, df2, df3, df4]
for i in List_Df:
    i.to_csv("C:\\Users\\Public\\Downloads\\" + i + ".csv")
Expected output: 4 csv files with the names df1.csv, df2.csv, ...
But I am facing two problems:
First problem:
AttributeError: 'list' object has no attribute 'to_csv'
Second problem:
("C:\\Users\\Public\\Downloads\\"+ **i** +".csv") <- **i** returns the object
as it's suppose to but I wish for python to automatically take the
object_name and use it with .csv
Any help will be greatly appreciated as I am new to Python and SOF.
Thank you :)
Try this:
import pandas as pd
List_Df = [df1,df2,df3,df4]
for i, e in enumerate(List_Df):
    df = pd.DataFrame(e)
    df.to_csv("C:\\Users\\Public\\Downloads\\" + "df" + str(i) + ".csv")
For your second problem, you would have to name the dataframes first, e.g.:
for j, df in enumerate(List_Df):
    df.name = 'df' + str(j)
    df.to_csv("C:\\Users\\Public\\Downloads\\%s.csv" % (df.name))
or even just take a string and add the index without naming the dataframes first:
for j, df in enumerate(List_Df):
    name = 'df' + str(j)
    df.to_csv("C:\\Users\\Public\\Downloads\\%s.csv" % (name))
I am trying to use a list that holds column names in my groupby expression. My end goal is to loop through multiple columns and run the calculation without having to re-write the same line multiple times. Is this possible?
a_list = list(['','BTC_','ETH_'])
a_variable = ('{}ClosePrice'.format(a_list[0]))
proccessing_data['RSI'] = proccessing_data.groupby('Symbol').**a_variable**.transform(lambda x: talib.RSI(x, timeperiod=14))
This is the error I currently get, because pandas thinks I want an attribute/column literally named a_variable, which doesn't exist:
AttributeError: 'DataFrameGroupBy' object has no attribute 'a_variable'
Apparently this notation below works:
proccessing_data['RSI'] = proccessing_data.groupby('Symbol')[('{}ClosePrice'.format(a_list[0]))].transform(lambda x: talib.RSI(x, timeperiod=14))
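To cover the stated end goal of looping through multiple columns, the bracket notation can simply be driven by the list; a sketch assuming proccessing_data contains the columns ClosePrice, BTC_ClosePrice and ETH_ClosePrice, and writing each result to a per-prefix RSI column (the output column names are illustrative):
import talib  # as used in the question; assumed to be installed

a_list = ['', 'BTC_', 'ETH_']

for prefix in a_list:
    col = '{}ClosePrice'.format(prefix)   # e.g. 'BTC_ClosePrice'
    rsi_col = '{}RSI'.format(prefix)      # e.g. 'BTC_RSI' (hypothetical output column)
    proccessing_data[rsi_col] = (
        proccessing_data.groupby('Symbol')[col]
        .transform(lambda x: talib.RSI(x, timeperiod=14))
    )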