I have been told that using exec is a Very Bad Thing.
However, I'm new to Python and trying to figure out how to dynamically create a bunch of global variables (I'm aware that this is also supposed to be a Bad Thing, but let's burn one bridge at a time, shall we?).
What this is doing: get a list of the variables that need to be created (currently sitting in a CSV), get the unique IDs within that list, then create the necessary objects by appending the ID to the name and reading the content of another CSV into each one.
import pandas as pd
def importtest():
    ilist = pd.read_csv('Z:/fakepath/ID.csv')
    for i in range(0, len(ilist['ID'].unique())):
        tempID = ilist['ID'].unique()[i]
        exec("variable%s = pd.read_csv('%s')" % (
            str(tempID), 'Z:/fakepath/'+str(tempID)+'.csv'), globals())
        i = i + 1
Is there another/better way to dynamically create/update the variables I need so they show up in the global scope?
The string keys in the globals() dictionary correspond to variable names, so you don't need exec; you can write the variable into the globals() dictionary directly:
globals()["variable" + str(tempID)] = pd.read_csv('Z:/fakepath/'+str(tempID)+'.csv')
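If the tables don't strictly have to be global names, a plain dictionary keyed by ID is a common alternative. The following is only a minimal sketch: it uses the standard-library csv module as a stand-in for pd.read_csv, and the helper name load_tables, the IDs and the temporary folder are all made up for the demo.

```python
import csv
import os
import tempfile


def load_tables(folder, ids):
    """Return {id: list of row dicts}, reading <id>.csv from folder for each ID."""
    tables = {}
    for table_id in ids:
        path = os.path.join(folder, f"{table_id}.csv")
        with open(path, newline="") as handle:
            # With pandas this line would be: tables[table_id] = pd.read_csv(path)
            tables[table_id] = list(csv.DictReader(handle))
    return tables


# Demo with throwaway files instead of the real Z:/fakepath directory.
with tempfile.TemporaryDirectory() as folder:
    for table_id in ("A1", "B2"):
        with open(os.path.join(folder, f"{table_id}.csv"), "w", newline="") as handle:
            handle.write("x,y\n1,2\n")
    tables = load_tables(folder, ["A1", "B2"])

print(sorted(tables))
print(tables["A1"][0]["x"])
```

Each table is then reached as tables[some_id] instead of through a generated variable name.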
I want to iterate over a list, and then pass that variable to another Python file, which writes that text.
forloop.py:
class Main:
    def list():
        list = ["a","b","c","d","e","f","g"]
        for i in list:
            print_this_variable = i
That iterates over the list; now I want to print the results in a separate file.
print.py:
from forloop import *
print(print_this_variable)
Thanks for the help.
You can't, the way you've configured things. The variable print_this_variable is local to list and won't be available outside of that method.
Here's one way to structure things (there are a variety of other ways, but your question isn't very clear about what you're actually trying to accomplish):
First, note that list is the name of the Python list data type -- you shouldn't use it as a name for functions or variables. Second, you shouldn't name variables the same as functions, because this will mask the function name and will probably bite you at some point.
So, in forloop.py, let's do this:
class Main:
    def example_function(self):
        data = ["a","b","c","d","e","f","g"]
        for i in data:
            self.print_this_variable = i
That makes print_this_variable an instance variable for Main objects.
In print.py, we could write:
import forloop
# We need to create a Main object
m = forloop.Main()
# The `print_this_variable` attribute isn't available until
# after we call the `example_function` method.
m.example_function()
# Now we can ask for the instance attribute
print(m.print_this_variable)
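Note that with the structure above the attribute is overwritten on every pass, so only the last element ("g") is left once the loop finishes. If the goal is to print every element from the other file, one option is to collect the values and return them. This is a sketch under that assumption; the method name collect and the attribute items are hypothetical.

```python
class Main:
    def collect(self):
        data = ["a", "b", "c", "d", "e", "f", "g"]
        # Keep every element instead of overwriting a single attribute.
        self.items = []
        for item in data:
            self.items.append(item)
        return self.items


# This part would live in print.py, after `import forloop`.
m = Main()
for item in m.collect():
    print(item)
```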
Hello, I am currently doing a project where I need up to 30,000 variables, which will be created dynamically. My problem, however, is accessing those variables dynamically. Storing them in an array and accessing them that way works, but I'd like to access them by name only. My code looks like:
NG=10
for i in range(1, NG+1 ):
    globals()[f"u_{i}"] = i
    print(u_{i})
Declaring variables like this works and they can be accessed by typing u_1, but the above print statement breaks the code.
Is there an option to access a variable similar to this in python?
You can access it the same way you set it:
globals()[f"u_{i}"]
That said, I strongly recommend that you NOT use global variables for this. You can use a dictionary instead, e.g.:
data = {}
data["some_key"] = 123
print(data["some_key"])
This works the same way as with global variables, but without the pain of global variables.
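Applied to the loop from the question, the dictionary version might look like the sketch below. The dictionary name u is an assumption; the keys reuse the u_1, u_2, ... names from the question.

```python
NG = 10
u = {}  # holds what would otherwise be the globals u_1 ... u_10

for i in range(1, NG + 1):
    u[f"u_{i}"] = i
    # The same f-string that builds the key is used to read the value back.
    print(u[f"u_{i}"])
```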
Using a Dictionary would be the best option if you ask me. Just to give an example of a dummy assignment:
import random
a={} # the dictionary
random.seed(5)
for i in range(30000):
    a['u'+str(i+1)]=random.random() # Or whatever value you want to put in the variable
print(a['u1']) # First variable and so on...
print(a['u2'])
Can you run a for loop over the names of multiple subsets?
For instance, I now have subsets dfVC1 up until dfVC20 and I would like to do something like:
for x in range(20):
    print(dfVC[x])
I get this doesn't work... but wonder if there is a way to do this.
I'm going to assume your 'subsets' in this case are variables, named dfVC0, dfVC1, etc. Then, your problem is that you want to print all of them by number, but since they're separate variables, you can't index into them.
One way to solve this would be to change how the 'subsets' are declared. Instead of
dfVC0 = ...
dfVC1 = ...
you could make one dfVC variable that's a dict, that holds all the others at their proper indices.
dfVC = {}
dfVC[0] = ...
dfVC[1] = ...
which would then allow you to access the various dfVC subsets in the way you're currently trying to.
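Put together, the dict-based version could look like this sketch. The strings are placeholders standing in for the real subsets, which are not shown in the question.

```python
dfVC = {}

# Build the subsets once, keyed by their number.
for x in range(20):
    dfVC[x] = f"subset {x}"  # placeholder for the real DataFrame

# The loop from the question now works as written.
for x in range(20):
    print(dfVC[x])
```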
But changing such a large part of the program isn't always possible. What you might be able to do instead is to figure out which object the dfVCs are attached to, and grab them by string.
If they're in the local namespace (i.e. were declared in the same function as you're currently executing in), you can call the built-in locals() to get a dict that you can then try to find your key in:
for x in range(20):
    sname = f'dfVC{x}'
    print(locals()[sname])
globals() can be used similarly, if your 'subsets' are in the global scope (i.e. declared outside of the current function).
And if your dfVC variables are attached to a class or module (or something else that behaves like a namespace), you can retrieve them using the built-in getattr() function:
for x in range(20):
    sname = f'dfVC{x}'
    print(getattr(self, sname)) # replace self with whichever object has the dfVC attached to it
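As a self-contained illustration of the getattr() route, here is a small sketch. The holder object and its placeholder values are made up; a default of None is passed as the third argument so that a missing name does not raise AttributeError.

```python
from types import SimpleNamespace

# Hypothetical object standing in for whatever the dfVC names are attached to.
holder = SimpleNamespace(dfVC0="first subset", dfVC1="second subset")

found = []
for x in range(3):
    sname = f"dfVC{x}"
    # dfVC2 does not exist on holder, so the default None is returned for it.
    found.append(getattr(holder, sname, None))

print(found)
```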
I would like to use a function's parameter to create dynamic names of dataframes and/or objects in Python. I have about 40 different names so it would be really elegant to do this in a function. Is there a way to do this or do I need to do this via 'dict'? I read that 'exec' is dangerous (not that I could get this to work). SAS has this feature for their macros which is where I am coming from. Here is an example of what I am trying to do (using '#' for illustrative purposes):
def TrainModels (mtype):
    model_#mtype = ExtraTreesClassifier()
    model_#mtype.fit(X_#mtype, Y_#mtype)
TrainModels ('FirstModel')
TrainModels ('SecondModel')
You could use a dictionary for this:
models = {}
def TrainModels (mtype, X, Y):
    models[mtype] = ExtraTreesClassifier()
    models[mtype].fit(X, Y)  # fit() needs the training data for this model
First of all, any name you define within your TrainModels function will be local to that function, so won't be accessible in the rest of your program. So you have to define a global name.
Python namespaces, including the global namespace, are backed by dictionaries. You can define a new global name dynamically as follows:
my_name = 'foo'
globals()[my_name] = 'bar'
This is terrible and you should never do it. It adds too much indirection to your code. When someone else (or yourself in 3 months, when the code is no longer fresh in your mind) reads the code and sees 'foo' used elsewhere, they'll have a hard time figuring out where it came from. Code analysis tools will not be able to help you.
I would use a dict as Milkboat suggested.
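A fuller sketch of the dict approach is shown below, covering the per-model training data as well. DummyModel is a stand-in so the example runs without scikit-learn; in the real code it would be ExtraTreesClassifier. The training_data dictionary, its toy values and the function name train_model are assumptions for illustration.

```python
class DummyModel:
    """Stand-in for ExtraTreesClassifier; it only records how many rows it saw."""

    def fit(self, X, y):
        self.n_samples = len(X)
        return self


# One (X, y) pair per model name replaces the X_<name> / Y_<name> variables.
training_data = {
    "FirstModel": ([[0], [1]], [0, 1]),
    "SecondModel": ([[0], [1], [2]], [0, 1, 0]),
}
models = {}


def train_model(mtype):
    X, y = training_data[mtype]
    models[mtype] = DummyModel().fit(X, y)


for name in training_data:
    train_model(name)

print(sorted(models))
```

Each trained model is then available as models["FirstModel"], models["SecondModel"], and so on.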
I want to read an external data source (excel) and create variables containing the data. Suppose the data is in columns and each column has a header with the variable name.
My first idea is to write a function so I can easily reuse it. Also, I could easily give some additional keyword arguments to make the function more versatile.
The problem I'm facing is that I want to refer to the data in Python (interactively) via the variable names. I don't know how to do that (with a function). The only solution I see is returning the variable names and the data from my function (e.g. as lists), and doing something like this:
def get_data(my_excel):
    (...)
    return names, values
names, values = get_data(my_excel)
for n,v in zip(names, values):
    exec(''.join([n, '= v']))
Can I get the same result directly?
Thanks,
Roel
Use a dictionary to store your mapping from name to value instead of creating local variables.
def get_data(excel_document):
    mapping = {}
    mapping['name1'] = 'value1'
    # ...
    return mapping
mapping = get_data(my_excel)
for name, value in mapping.items():
    print(name, value)  # use them here
If you really want to populate variables from the mapping, you can modify globals() (or locals()), but it is generally considered bad practice.
mapping = get_data(my_excel)
globals().update(mapping)
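If you just want to set variables for each name in names at module level (where locals() is the same dictionary as globals()), use the following. Inside a function, writing to locals() is not guaranteed to create or change local variables, so don't rely on it there.
for n, v in zip(names, values):
    locals()[n] = v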
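One caveat worth checking before relying on locals(): in CPython, writing to locals() inside a function does not create a real local variable. The sketch below demonstrates this; the names assign_via_locals and freshly_made are made up for the demo.

```python
def assign_via_locals():
    # This writes into a dictionary, but it does not define a local variable.
    locals()["freshly_made"] = 42
    try:
        return freshly_made
    except NameError:
        return "not defined"


result = assign_via_locals()
print(result)
```

The name lookup fails with NameError because the function body never binds freshly_made as a local or global name.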
If you'd rather have a single object to access the data, which is much cleaner, simply use a dict and return that from your function.
def get_data():
    (...)
    return dict(zip(names, values))
To access the value of the name "a", simply use get_data()["a"].
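As a concrete illustration of dict(zip(...)), here is a sketch with made-up column names and values; the real ones would come from the spreadsheet.

```python
# Hypothetical headers and column contents standing in for the Excel data.
names = ["a", "b"]
values = [[1, 2, 3], [4, 5, 6]]

# zip pairs each header with its column; dict turns the pairs into a mapping.
data = dict(zip(names, values))

print(data["a"])
print(sorted(data))
```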
Finally, if you want to access the data as attributes of an object, you can update the __dict__ of an object (unexpected behaviour may occur if any of your column names match the names of the object's methods or other attributes).
class Data(object):
    def __init__(self, my_excel):
        (...)
        self.__dict__.update(zip(names, values))
data = Data("test.xls")
print(data.a)
The traditional approach would be to stuff the key/value pairs into a dict so that you can easily pass the whole structure around to other functions. If you really want to store them as attributes instead of dict keys, consider creating a class to hold them:
class Values(object): pass
store = Values()
for key, value in zip(names, values):
    setattr(store, key, value)
That keeps the variables in their own namespace, separate from your running code. That's almost always a Good Thing. What if you get a spreadsheet with a header called "my_excel"? Suddenly you've lost access to your original my_excel object, which would be very inconvenient if you needed it again.
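If the headers come from an untrusted spreadsheet, it may also be worth checking that each one is a usable attribute name before calling setattr. The sketch below shows one way to do that with str.isidentifier() and keyword.iskeyword(); the helper name to_namespace and the sample data are hypothetical.

```python
import keyword
from types import SimpleNamespace


def to_namespace(names, values):
    """Store each value under its name, rejecting names that aren't valid identifiers."""
    store = SimpleNamespace()
    for key, value in zip(names, values):
        if not key.isidentifier() or keyword.iskeyword(key):
            raise ValueError(f"{key!r} is not a usable attribute name")
        setattr(store, key, value)
    return store


store = to_namespace(["height", "weight"], [[1.8, 1.7], [80, 65]])
print(store.height)

# A header that is really a code snippet is rejected instead of being used.
try:
    to_namespace(["os.system('echo hi')"], [[0]])
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```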
But in any case, you should never use exec unless you know exactly what you're doing. And even then, don't use exec. For instance, suppose I know how your code works and send you a spreadsheet with "os.system('echo rm -rf *')" in a cell. You probably don't really want to execute that.