I would like to use a function's parameter to create dynamic names of dataframes and/or objects in Python. I have about 40 different names so it would be really elegant to do this in a function. Is there a way to do this or do I need to do this via 'dict'? I read that 'exec' is dangerous (not that I could get this to work). SAS has this feature for their macros which is where I am coming from. Here is an example of what I am trying to do (using '#' for illustrative purposes):
def TrainModels(mtype):
    model_#mtype = ExtraTreesClassifier()
    model_#mtype.fit(X_#mtype, Y_#mtype)

TrainModels('FirstModel')
TrainModels('SecondModel')
You could use a dictionary for this:
models = {}
def TrainModels(mtype):
    models[mtype] = ExtraTreesClassifier()
    # Assumes X and Y are dicts of training data keyed by the same names
    models[mtype].fit(X[mtype], Y[mtype])
First of all, any name you define within your TrainModels function will be local to that function, so won't be accessible in the rest of your program. So you have to define a global name.
Python namespaces are dictionaries under the hood, including the global namespace. You can define a new global name dynamically as follows:
my_name = 'foo'
globals()[my_name] = 'bar'
This is terrible and you should never do it. It adds too much indirection to your code. When someone else (or you in 3 months, when the code is no longer fresh in your mind) reads the code and sees 'foo' used elsewhere, they'll have a hard time figuring out where it came from. Code analysis tools will not be able to help you.
I would use a dict as Milkboat suggested.
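To make the dictionary approach concrete, here is a minimal sketch of a function that takes the model name as a parameter and stores one fitted model per name. DummyModel is a hypothetical stand-in for ExtraTreesClassifier so the snippet runs without scikit-learn, and the datasets dict and its contents are made up for illustration.

```python
class DummyModel:
    """Hypothetical stand-in for ExtraTreesClassifier (not a real estimator)."""

    def fit(self, X, y):
        # Record the training-set size so there is something to inspect.
        self.n_samples_ = len(X)
        return self


# Illustrative data: one (X, y) pair per model name.
datasets = {
    "FirstModel": ([[0], [1]], [0, 1]),
    "SecondModel": ([[0], [1], [2]], [0, 1, 0]),
}

models = {}


def train_model(mtype):
    """Fit a model on the data registered under `mtype` and keep it by name."""
    X, y = datasets[mtype]
    models[mtype] = DummyModel().fit(X, y)
    return models[mtype]


for name in datasets:
    train_model(name)

print(sorted(models))                    # ['FirstModel', 'SecondModel']
print(models["SecondModel"].n_samples_)  # 3
```

With 40 names, only the datasets dict grows; the function and the loop stay the same.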
Hello, I am currently doing a project where I need up to 30,000 variables, which will be created dynamically. My problem, however, is accessing those variables dynamically. Storing them in an array and accessing them that way works, but I'd like to access them by name only. My code looks like:
NG = 10
for i in range(1, NG + 1):
    globals()[f"u_{i}"] = i
    print(u_{i})
Declaring variables like this works and they can be accessed by typing u_1, but the above print statement breaks the code.
Is there an option to access a variable similar to this in python?
You can access it the same way you set it:
globals()[f"u_{i}"]
However, I strongly recommend that you NOT use global variables for this. You can use a dictionary instead, e.g.:
data = {}
data["some_key"] = 123
print(data["some_key"])
This works the same way as it does with global variables, without the pain that global variables bring.
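Applied to the loop from the question, that could look like the sketch below. The name u for the dictionary is an assumption; the keys reuse the u_{i} pattern so each value is still reachable by its name.

```python
NG = 10

# One dictionary instead of NG separate global variables.
u = {f"u_{i}": i for i in range(1, NG + 1)}

# Look a value up by a name built at runtime.
for i in range(1, NG + 1):
    print(u[f"u_{i}"])

# A specific name works too.
print(u["u_3"])  # 3
```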
Using a Dictionary would be the best option if you ask me. Just to give an example of a dummy assignment:
import random
a={} # the dictionary
random.seed(5)
for i in range(30000):
    a['u' + str(i + 1)] = random.random()  # Or whatever value you want to put in the variable
print(a['u1']) # First variable and so on...
print(a['u2'])
Can you run a for loop over the names of multiple subsets?
For instance, I now have subsets dfVC1 up until dfVC20 and I would like to do something like:
for x in range(20):
print(dfVC[x])
I get this doesn't work... but wonder if there is a way to do this.
I'm going to assume your 'subsets' in this case are variables named dfVC0, dfVC1, etc. Your problem, then, is that you want to print all of them by number, but since they're separate variables, you can't index them.
One way to solve this would be to change how the 'subsets' are declared. Instead of
dfVC0 = ...
dfVC1 = ...
you could make one dfVC variable that's a dict, that holds all the others at their proper indices.
dfVC = {}
dfVC[0] = ...
dfVC[1] = ...
which would then allow you to access the various dfVC subsets in the way you're currently trying to.
But changing such a large part of the program isn't always possible. What you might be able to do instead is to figure out which object the dfVCs are attached to, and grab them by string.
If they're in the local namespace (i.e. were declared in the same function as you're currently executing in), you can call the built-in locals() to get a dict that you can then try to find your key in:
for x in range(20):
    sname = f'dfVC{x}'
    print(locals()[sname])
globals() can be used similarly, if your 'subsets' are in the global scope (i.e. declared outside of the current function).
And if your dfVC variables are attached to a class or module (or something else that behaves like a namespace), you can retrieve them using the built-in getattr() function:
for x in range(20):
    sname = f'dfVC{x}'
    print(getattr(self, sname))  # replace self with whichever object has the dfVC variables attached to it
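As a sketch of the attribute route, the snippet below uses types.SimpleNamespace as a stand-in for whatever object holds the subsets (the holder name and the placeholder string values are assumptions), and then gathers them into a dict so later code can loop by number.

```python
from types import SimpleNamespace

# Stand-in object with attributes dfVC0 .. dfVC2 (values are placeholders).
holder = SimpleNamespace(dfVC0="subset 0", dfVC1="subset 1", dfVC2="subset 2")

# Collect the numbered attributes once, keyed by their number.
dfVC = {x: getattr(holder, f"dfVC{x}") for x in range(3)}

for x in range(3):
    print(dfVC[x])

# getattr accepts a default for names that may not exist.
missing = getattr(holder, "dfVC99", None)
print(missing)  # None
```

Doing the lookup by string once and keeping the result in a dict keeps the rest of the program free of name-building code.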
In python 3.4, I want to be able to do a very simple dispatch table for testing purposes. The idea is to have a dictionary with the key being a string of the name of the function to be tested and the data item being the name of the test function.
For example:
myTestList = (
    "myDrawFromTo",
    "myDrawLineDir"
)
myTestDict = {
    "myDrawFromTo": test_myDrawFromTo,
    "myDrawLineDir": test_myDrawLineDir
}
for myTest in myTestList:
    result = myTestDict[myTest]()
The idea is that I have a list of function names someplace. In this example, I manually create a dictionary that maps those names to the names of test functions. The test function names are a simple extension of the function name. I'd like to compute the entire dictionary from the list of function names (here it is myTestList).
Alternatively, if I can do the same thing without the dictionary, that'd be fine, too. I tried just building a new string from the entries in myTestList and then using locals() to set up the call, but didn't have any luck. The dictionary idea came from the Python 3.x documentation.
There are two parts to the problem.
The easy part is just prefixing 'test_' onto each string:
tests = {test: 'test_'+test for test in myTestList}
The harder part is actually looking up the functions by name. That kind of thing is usually a bad idea, but you've hit on one of the cases (generating tests) where it often makes sense. You can do this by looking them up in your module's global dictionary, like this:
tests = {test: globals()['test_'+test] for test in myTestList}
There are variations on the same idea if the tests live somewhere other than the module's global scope. For example, it might be a good idea to make them all methods of a class, in which case you'd do:
tester = TestClass()
tests = {test: getattr(tester, 'test_'+test) for test in myTestList}
(Although more likely that code would be inside TestClass, so it would be using self rather than tester.)
If you don't actually need the dict, of course, you can change the comprehension to an explicit for statement:
for test in myTestList:
    globals()['test_'+test]()
One more thing: Before reinventing the wheel, have you looked at the testing frameworks built into the stdlib, or available on PyPI?
Abarnert's answer seems to be useful but to answer your original question of how to call all test functions for a list of function names:
def test_f():
    print("testing f...")

def test_g():
    print("testing g...")

myTestList = ['f', 'g']
for funcname in myTestList:
    eval('test_' + funcname + '()')
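If you would rather avoid eval, the same loop can be written with a plain name lookup. The sketch below assumes the tests are grouped on a class (the Tests class, its methods, and run_tests are made up for illustration) and reports names that have no matching test instead of raising.

```python
class Tests:
    """Illustrative container; each test_<name> method tests function <name>."""

    def test_f(self):
        return "testing f..."

    def test_g(self):
        return "testing g..."


def run_tests(names, tester):
    """Call tester.test_<name>() for each name and collect the results."""
    results = {}
    for name in names:
        func = getattr(tester, "test_" + name, None)
        if func is None:
            results[name] = "no test found"
        else:
            results[name] = func()
    return results


results = run_tests(["f", "g", "h"], Tests())
print(results)
```

Because getattr only ever resolves attribute names, a stray string in the list cannot execute arbitrary code the way it could with eval.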
I am writing some program using python and the z3py module.
What I am trying to do is the following: I extract a constraint of an if or a while statement from a function which is located in some other file. Additionally I extract the used variables in the statement as well as their types.
As I do not want to parse the constraint by hand into a z3py-friendly form, I tried to use eval to do this for me. For this I used the tip from the following page: Z3 with string expressions
Now the problem is: I do not know what the variables in the constraint are called. But it seems that I have to give the handle of each variable the same name as the actual variable. Otherwise eval won't find it. My code looks like this:
solver = Solver()
# Look up the constraint:
branch = bd.getBranchNum(0)
constr = branch.code
# Create handle for each variable, depending on its type:
for k in mapper.getVariables():
    var = mapper.getVariables()[k]
    if k in constr:
        if var.type == "intNum":
            Int(k)
        else:
            Real(k)
# Evaluate constraint, insert the result and solve it:
f = eval(constr)
solver.insert(f)
solve(f)
As you can see I saved the variables and constraints in classes. When executing this code I get the following error:
NameError: name 'real_x' is not defined
If I do not use the looping over the variables, but instead the following code, everything works fine:
solver = Solver()
branch = bd.getBranchNum(0)
constr = branch.code
print(constr)
real_x = Real('real_x')
int_y = Int('int_y')
f = eval(constr)
print(f)
solver.insert(f)
solve(f)
The problem is: I do not know in advance that the variables are called "real_x" or "int_y". Furthermore, I do not know how many variables are used, which means I have to use something dynamic like a loop.
Now my question is: Is there a way around this? What can I do to tell python that the handles already exist, but have a different name? Or is my approach completely wrong and I have to do something totally different?
This kind of thing is almost always a bad idea (see Why eval/exec is bad for more details), but "almost always" isn't "always", and it looks like you're using a library that was specifically designed to be used this way, in which case you've found one of the exceptions.
And at first glance, it seems like you've also hit one of the rare exceptions to the Keep data out of your variable names guideline (also see Why you don't want to dynamically create variables). But you haven't.
The only reason you need these variables like real_x to exist is so that eval can see them, right? But the eval function already knows how to look for variables in a dictionary instead of in your global namespace. And it looks like what you're getting back from mapper.getVariables() is a dictionary.
So, skip that whole messy loop, and just do this:
variables = mapper.getVariables()
f = eval(constr, globals=variables)
(Before Python 3.13, globals is a positional-only argument, so just drop the globals= if you get an error about that.)
As the documentation explains, eval inserts a reference to Python's builtins into that dictionary when it has no __builtins__ key, so the expression can still call any builtin function and do all kinds of unsafe things. If you want to restrict that, do this:
variables = dict(mapper.getVariables())
variables['__builtins__'] = {}
f = eval(constr, globals=variables)
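To see how the namespace argument behaves without z3py installed, here is a small sketch that uses plain numbers in place of the solver handles. The variable names and the constraint string are made up; with z3py the dictionary values would be the objects returned by Int and Real. The dictionary is passed positionally so the snippet runs on any Python 3 version.

```python
# Stand-in for the extracted constraint text.
constr = "real_x * 2 > int_y"

# Names the expression may use; nothing else is visible to it.
variables = {"real_x": 2.5, "int_y": 4}
variables["__builtins__"] = {}  # hide the builtin functions

result = eval(constr, variables)
print(result)  # True

# A name that is not in the dictionary is simply undefined.
try:
    eval("open('somefile')", variables)
except NameError as exc:
    print("blocked:", exc)
```

Emptying __builtins__ narrows what the expression can reach, but it is not a complete sandbox for untrusted input.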
I know this must be a trivial question, but I've tried many different ways and searched quite a bit for a solution: how do I create and reference subfunctions in the current module?
For example, I am writing a program to parse through a text file, and I want to assign each of the 300 different names in it to a category.
There are 300 of these, and I have a list of them structured to create a dict, so of the form lookup[key]=value (bonus question: is there any more efficient or sensible way to do this than a massive dict?).
I would like to keep all of this in the same module, but with the functions (dict initialisation, etc.) at the end of the file, so I don't have to scroll down 300 lines to see the code, i.e. laid out as in the example below.
When I run it as below, I get the error 'initlookups is not defined'. When I structure it so that it is initialisation, then function definition, then function use, there is no problem.
I'm sure there must be an obvious way to initialise the functions and associated dict without keeping the code inline, but I have tried quite a few so far without success. I can put it in an external module and import this, but would prefer not to for simplicity.
What should I be doing in terms of module structure? Is there any better way than using a dict to store this lookup table? (It is 300 unique text keys mapping on to approx. 10 categories.)
Thanks,
Brendan
import ..... (initialisation code,etc )
initLookups()  # **Should create the dict - How should this be referenced?**
print(getlookup(KEY))  # **How should this be referenced?**

def initLookups():
    global lookup
    lookup = {}
    lookup["A"] = "AA"
    lookup["B"] = "BB"
    (etc etc etc....)

def getlookup(name):
    if name in lookup.keys():
        getlookup = lookup[name]
    else:
        getlookup = ""
    return getlookup
A function needs to be defined before it can be called. If you want to have the code that needs to be executed at the top of the file, just define a main function and call it from the bottom:
import sys

def main(args):
    pass

# All your other function definitions here

if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))
This way, everything that main references will already have been defined by the time main runs. The reason for testing __name__ is that the main function will then only run when the script is executed directly, not when it is imported by another file.
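Applied to the lookup example from the question, the layout could look like the sketch below: main sits at the top and calls helpers that are defined further down, which works because nothing runs until the last line. The function names are illustrative.

```python
def main():
    # By the time main() is called, the functions below already exist.
    lookup = init_lookups()
    return get_lookup(lookup, "A")


def init_lookups():
    """Build the lookup table (shortened to two entries here)."""
    return {"A": "AA", "B": "BB"}


def get_lookup(lookup, name):
    """Return the category for name, or an empty string if unknown."""
    return lookup.get(name, "")


if __name__ == "__main__":
    print(main())  # AA
```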
Side note: a dict with 300 keys is by no means massive, but you may want to either move the code that fills the dict to a separate module, or (perhaps more fancy) store the key/value pairs in a format like JSON and load it when the program starts.
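A minimal sketch of the JSON option follows. The JSON text is inlined so the snippet runs on its own; in practice it would be read from a file with json.load. The keys, the categories, and the load_lookup name are placeholders.

```python
import json

# Placeholder data; normally this would live in a separate .json file.
RAW = '{"A": "AA", "B": "BB", "C": "CC"}'


def load_lookup(text):
    """Parse a JSON object into a plain dict of key -> category."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


lookup = load_lookup(RAW)
print(lookup.get("B", ""))  # BB
print(lookup.get("Z", ""))  # prints an empty line
```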
Here's a more Pythonic way to do this. There aren't a lot of choices, BTW.
A function must be defined before it can be used. Period.
However, you don't have to strictly order all functions for the compiler's benefit. You merely have to put your execution of the functions last.
# import ... (initialisation code, etc.)

def initLookups():  # Definitions must come before actual use
    lookup = {}
    lookup["A"] = "AA"
    lookup["B"] = "BB"
    # (etc etc etc....)
    return lookup

# Any functions initLookups uses can be defined here,
# as long as they're findable in the same module.

if __name__ == "__main__":  # Use comes last
    lookup = initLookups()
    print(lookup.get("Key", ""))
Note that you don't need the getlookup function, it's a built-in feature of a dict, named get.
Also, "initialisation code" is suspicious. An import should not "do" anything. It should define functions and classes, but not actually provide any executable code. In the long run, executable code that is processed by an import can become a maintenance nightmare.
The most notable exception is a module-level Singleton object that gets created by default. Even then, be sure that the mystery object which makes a module work is clearly identified in the documentation.
If your lookup dict is unchanging, the simplest way is to just make it a module-scope variable, i.e.:
lookup = {
    'A': 'AA',
    'B': 'BB',
    ...
}
If you may need to make changes, and later re-initialise it, you can do this in an initialisation function:
def initLookups():
    global lookup
    lookup = {
        'A': 'AA',
        'B': 'BB',
        ...
    }
(Alternatively, lookup.update({'A':'AA', ...}) to change the dict in-place, affecting all callers with access to the old binding.)
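The difference between rebinding the name and updating in place can be seen in this sketch, where alias plays the role of another piece of code that grabbed a reference to the dict earlier. All names and values are illustrative.

```python
lookup = {"A": "AA"}
alias = lookup  # some other code holding a reference to the same dict

# In-place change: every holder of the dict sees the new entry.
lookup.update({"B": "BB"})
print(alias)  # {'A': 'AA', 'B': 'BB'}

# Rebinding: `lookup` now names a new dict, `alias` still has the old one.
lookup = {"C": "CC"}
print(alias)   # {'A': 'AA', 'B': 'BB'}
print(lookup)  # {'C': 'CC'}
```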
However, if you've got these lookups in some standard format, it may be simpler simply to load it from a file and create the dictionary from that.
You can arrange your functions as you wish. The only rule about ordering is that the accessed variables must exist at the time the function is called - it's fine if the function has references to variables in the body that don't exist yet, so long as nothing actually tries to use that function. ie:
def foo():
    print(greeting, "World")  # Note that greeting is not yet defined when foo() is created

greeting = "Hello"
foo()  # Prints "Hello World"
But:
def foo():
    print(greeting, "World")

foo()  # Gives an error - greeting not yet defined.
greeting = "Hello"
One further thing to note: your getlookup function is very inefficient. In Python 2, "if name in lookup.keys()" builds a list of the keys from the dict and then iterates over that list to find the item, which loses all the performance benefit the dict gives (in Python 3, keys() returns a view, so the test is fast, but the call is still redundant). Instead, "if name in lookup" would avoid this, or even better, use the fact that .get can be given a default to return if the key is not in the dictionary:
def getlookup(name):
    return lookup.get(name, "")
I think that keeping the names in a flat text file and loading them at runtime would be a good alternative. I try to stick to the lowest level of complexity possible with my data, starting with plain text and working up to an RDBMS (I lifted this idea from The Pragmatic Programmer).
Dictionaries are very efficient in python. It's essentially what the whole language is built on. 300 items is well within the bounds of sane dict usage.
names.txt:
A = AAA
B = BBB
C = CCC
getname.py:
import sys

FILENAME = "names.txt"

def main(key):
    # Read the "KEY = VALUE" lines and build the lookup dict
    with open(FILENAME) as f:
        pairs = [line.split("=") for line in f if "=" in line]
    names = dict((x.strip(), y.strip()) for x, y in pairs)
    return names.get(key, "Not found")

if __name__ == "__main__":
    print(main(sys.argv[-1]))
If you really want to keep it all in one module for some reason, you could just stick a string at the top of the module. I think that a big swath of text is less distracting than a huge mess of dict initialization code (and easier to edit later):
import sys

LINES = """
A = AAA
B = BBB
C = CCC
D = DDD
E = EEE""".strip().splitlines()
PAIRS = (line.split("=") for line in LINES)
NAMES = dict((x.strip(), y.strip()) for x, y in PAIRS)

def main(key):
    return NAMES.get(key, "Not found")

if __name__ == "__main__":
    print(main(sys.argv[-1]))
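The parsing above assumes every line contains exactly one '='. A slightly more forgiving sketch, which skips blank lines and comment lines and splits only on the first '=', could look like this. The '#' comment convention and the parse_pairs name are assumptions, not part of the original format.

```python
def parse_pairs(text):
    """Turn 'KEY = VALUE' lines into a dict, ignoring blanks and # comments."""
    names = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in line: {line!r}")
        names[key.strip()] = value.strip()
    return names


TEXT = """
# name = category
A = AAA
B = BBB

C = x=y
"""

NAMES = parse_pairs(TEXT)
print(NAMES)  # {'A': 'AAA', 'B': 'BBB', 'C': 'x=y'}
```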