Generating a list from several dataframes - python

I have several data frames named data1, data2, data3, data4, ... data100. How can I store them in a list so that I can plot them with a for loop?
Thanks in advance.

The problem you are experiencing is a symptom of using numbered variables.
You can avoid the problem entirely by using a list of DataFrames (e.g. data)
instead of using numbered variables (data1, data2, data3, etc.)
The trick is to avoid creating the numbered variables in the first place.
If you have code of the form
data1 = ...
data2 = ...
data3 = ...
Try to replace it with something like
data = []
data.append(...)
data.append(...)
data.append(...)
or better yet, use a list comprehension or a for-loop to define data. For more specific suggestions, show us the code that defines the numbered variables.
Then you could loop over the DataFrames with
for df in data:
    df.plot(...)
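A minimal sketch of the list-based approach described above. The loader function load_table and the range of three items are hypothetical placeholders, and plain lists of numbers stand in for DataFrames so that the snippet runs without pandas. With real DataFrames, the loop body would call df.plot(...) instead of computing a sum.

```python
def load_table(i):
    """Hypothetical stand-in for whatever produces the i-th DataFrame."""
    return [i * n for n in range(1, 4)]

# Build the whole collection in one step instead of data1, data2, ...
data = [load_table(i) for i in range(1, 4)]

# Loop over the collection; enumerate supplies a 1-based label for each item.
summaries = []
for number, table in enumerate(data, start=1):
    summaries.append((number, sum(table)))

print(summaries)
```

Because the items live in one list, adding a fourth or a hundredth table only changes the range, not the rest of the code.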
If for some reason you can not prevent (someone else?) from defining the numbered variables, then you could use globals() (or locals()) to access the numbered variables programmatically:
g = globals()
data = [g['data{}'.format(i)] for i in range(1, 101)]
globals() returns a dictionary whose keys are string representations of names
in the global namespace. The associated values are the Python objects bound to
those names. Thus, you can use globals() to look up the values bound to
variable names based on the string representation of those variable names.
Use locals() if the variable names are defined in the local (rather than global) namespace.
Still, try to avoid using numbered variables. This use of globals() is merely a workaround for trouble someone else is causing. Using string formatting to look up variable names is not great programming style when simple integer indexing (of a list) should suffice. The best solution is to convince that someone to stop using numbered variables and to instead deliver the values in a list.
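To make the workaround a little more robust, the lookup can be wrapped in a small function that reports which names are missing instead of failing on the first one. This is only a sketch: the helper name collect_numbered is made up for illustration, and short lists stand in for the DataFrames.

```python
# Pretend these numbered names were defined by someone else's code.
data1 = [1, 2]
data2 = [3, 4]
data3 = [5, 6]

def collect_numbered(namespace, prefix, start, stop):
    """Return the objects bound to prefix<start> .. prefix<stop> (inclusive)."""
    names = ['{}{}'.format(prefix, i) for i in range(start, stop + 1)]
    missing = [name for name in names if name not in namespace]
    if missing:
        raise KeyError('names not defined: {}'.format(', '.join(missing)))
    return [namespace[name] for name in names]

# globals() is passed in explicitly, so the same helper also accepts locals().
data = collect_numbered(globals(), 'data', 1, 3)
print(data)
```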

Related

Create a lengthy list of variable names in python with for loop

I have a bunch of data frames whose names are Actuals_17, Actuals_18, ..., Actuals_45.
I would like to create a list of these variables. I know that it can be created by manually typing:
varlist = [Actuals_17, Actuals_18, Actuals_19, Actuals_20]
However, this does not make for neat code when the number of variables is large. Is there a way to build varlist with a for loop?
You can use Python's built-in globals() function, which returns a dictionary representing the current global symbol table:
varlist = [globals()[f'Actuals_{i}'] for i in range(17, 46)]
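If the numbers in the names carry meaning (a year, for instance), a dictionary keyed by that number may be more convenient than a positional list. A minimal sketch, assuming three placeholder variables with short lists standing in for the data frames:

```python
# Placeholder objects standing in for the real data frames.
Actuals_17 = ['a']
Actuals_18 = ['b']
Actuals_19 = ['c']

# Map the numeric suffix to the object, so actuals[18] replaces Actuals_18.
actuals = {i: globals()[f'Actuals_{i}'] for i in range(17, 20)}

# A positional list is still easy to recover, in ascending key order.
varlist = [actuals[i] for i in sorted(actuals)]
print(varlist)
```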

Loop to dynamically assign new pandas DataFrame variables

I have an imports dictionary with:
keys equal to names of new variables I would like to build, for example dataset_1, dataset_2 etc.
values being the pandas DataFrames (the type of each value is pd.DataFrame)
What I would like to achieve is to create len(keys) new variables. Each variable would be named after its key and would hold the corresponding pd.DataFrame.
The code below doesn't work, and in any case I have a strong feeling that it's a bad approach and that a 'regular programmer' would do this another way.
for key in imports.keys():
    import_str = '{} = imports.get({})'.format(key, key)
    globalize = 'global {}'.format(key)
    exec(globalize)
    exec(import_str)
Can you please advise how to proceed?
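For what it's worth, the loop above fails because the formatted string becomes dataset_1 = imports.get(dataset_1), with the key unquoted, so Python tries to look up a name that does not exist yet. The separate exec('global ...') call also has no effect on the exec that follows it. A hedged sketch of two alternatives: using the dictionary directly, or, if module-level names are truly required, updating globals() without exec. The contents of imports here are placeholders, not real DataFrames.

```python
# Placeholder values; in the question these would be pandas DataFrames.
imports = {'dataset_1': [1, 2, 3], 'dataset_2': [4, 5, 6]}

# Option 1 (usually preferable): keep using the dictionary itself.
for name, frame in imports.items():
    print(name, len(frame))

# Option 2: bind each key as a module-level name without exec.
# Every key must be a valid Python identifier for this to be useful.
assert all(key.isidentifier() for key in imports)
globals().update(imports)

print(dataset_1)  # the same object as imports['dataset_1']
```

Option 2 has the same drawbacks as any dynamically created variable: linters and readers cannot see where dataset_1 comes from.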

Creating/Getting/Extracting multiple data frames from python dictionary of dataframes

I have a Python dictionary with dataset names as keys and the data frames themselves as values; see the dictionary dict below.
[Dictionary of Dataframes]
One way is to write all the code manually, like below:
csv = dict['csv.pkl']
csv_emp = dict['csv_emp.pkl']
csv_emp_yr = dict['csv_emp_yr.pkl']
emp_wf = dict['emp_wf.pkl']
emp_yr_wf = dict['emp_yr_wf.pkl']
But this becomes very tedious as the number of datasets grows.
Any help on how to get this done over a loop?
Although I would not recommend this method, you can try this:
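import sys
this = sys.modules[__name__]  # the current module object, i.e. its global namespace
for key in dict.keys():
    setattr(this, key, dict[key])
Now you can check the new variables, whose names are the same as the dictionary keys. Note that a key such as 'csv.pkl' is not a valid identifier, so it has to be cleaned up first (for example by dropping the '.pkl' suffix) before it can be used as a plain variable name.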
globals() carries some risk: it gives you whatever the namespace currently points to, which can change, so modifying the dictionary returned by globals() is not a good idea.
A list can also be used (limited use cases):
dataframes = []
for key in dict.keys():
    dataframes.append(dict[key])
Still, this is your choice; both of the above methods have some limitations.
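As a sketch of a loop-based alternative, the keys can be cleaned once and the data frames then accessed through a dictionary rather than through new variables. The dictionary name frames and its string values are placeholders (the question calls the dictionary dict, which shadows the built-in); the keys are copied from the question.

```python
# Placeholder values standing in for the real data frames.
frames = {
    'csv.pkl': 'frame A',
    'csv_emp.pkl': 'frame B',
    'emp_wf.pkl': 'frame C',
}

def strip_suffix(key, suffix='.pkl'):
    """Drop the file suffix so 'csv_emp.pkl' becomes 'csv_emp'."""
    return key[:-len(suffix)] if key.endswith(suffix) else key

# One dictionary with clean keys replaces a series of manual assignments.
by_name = {strip_suffix(key): frame for key, frame in frames.items()}

for name, frame in by_name.items():
    print(name, frame)
```

With this in place, by_name['csv_emp'] plays the role that the variable csv_emp would have played.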

Alternatives for exec to get values of any variable

For debugging purposes I need to know the values of some variables in Python. But I don't want to create a dictionary with all variables, and I don't want to add every new variable that I want to test at some point to a dictionary, especially when it comes to lists and their content, which is often procedurally generated.
So is there an alternative to just taking input as a string and then executing it by exec() or using print() and hardcoded variables?
Yes, there is a way. You can use the locals() function to get a dictionary of all variables. For example:
a = 5
b = locals()["a"]
Now b will get the value of a, i.e. 5.
However, just because you can do this doesn't mean you should. There may be something wrong in the structure of your program if you want to access variables by using their name stored in a string.
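For the debugging use case, a small helper that takes a namespace dictionary and a few names avoids both exec() and a hand-maintained dictionary. This is only a sketch; the helper name snapshot and the sample function compute are made up for illustration.

```python
def snapshot(namespace, *names):
    """Return {name: value} for the requested names; missing names are marked."""
    return {name: namespace.get(name, '<undefined>') for name in names}

def compute():
    a = 5
    items = [a * n for n in range(3)]
    # locals() is only read here for inspection, never written to.
    return snapshot(locals(), 'a', 'items', 'missing')

print(compute())
```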

Python: Splitting lists into multiple lists

In Python, suppose I have a list:
instruments = ['apd', 'dd', 'dow', 'ecl']
How can I split this list so that it creates:
apd[]
dd[]
dow[]
ecl[]
Thanks for the help.
You would do this:
dictionaries = {i:[] for i in instruments}
and you would refer to each list this way:
dictionaries['apd']
dictionaries['dd']
dictionaries['dow']
dictionaries['ecl']
This is considered much better practice than actually having the lists in the current namespace, which would be both polluting and unpythonic.
mshsayem has the method to place the lists in the current scope, but the question is, what benefits do you get from putting them in your current scope?
Standard use cases:
you already know the names of the items, and want to refer to them directly by name, i.e. apd.append
you don't know the names yet, but you'll use eval or locals to get the lists, i.e. eval('apd').append or locals()['apd'].append
Both can be satisfied using dictionaries:
dictionaries['<some name that can be set programmatically or using a constant>'].append
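A short sketch of how the dictionary covers both cases. The price values are made up for illustration.

```python
instruments = ['apd', 'dd', 'dow', 'ecl']

# One empty list per instrument, keyed by name.
dictionaries = {name: [] for name in instruments}

# Case 1: the name is known in advance.
dictionaries['apd'].append(101.5)

# Case 2: the name is only known at run time.
for name, price in [('dd', 20.0), ('apd', 102.0)]:
    dictionaries[name].append(price)

print(dictionaries)
```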
Try this:
instruments = ['apd', 'dd', 'dow', 'ecl']
l_dict = locals()
for i in instruments:
    l_dict[i] = []
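This will create the apd, dd, dow and ecl lists in the current scope when run at module level, where locals() and globals() are the same dictionary. Inside a function, writing to locals() does not reliably create local variables.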
Snakes and Cofee's idea is better though.
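One caveat worth illustrating: the locals() trick depends on being run at module level. Inside a function, assigning into locals() does not create a usable local variable in CPython. A minimal sketch, with a hypothetical function name:

```python
def make_lists_in_function(names):
    for name in names:
        locals()[name] = []  # does not bind a real local variable
    try:
        apd  # try to read the name as an ordinary variable
    except NameError:
        return False
    return True

created = make_lists_in_function(['apd', 'dd'])
print(created)
```

This is one more reason to prefer the dictionary approach, which behaves the same way at module level and inside functions.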
