Scope, using functions in current module - python

I know this must be a trivial question, and I've tried many different ways and searched quite a bit for a solution, but how do I create and reference subfunctions in the current module?
For example, I am writing a program to parse through a text file, and for each of the 300 different names in it, I want to assign to a category.
There are 300 of these, and I have a list of these structured to create a dict, so of the form lookup[key]=value (bonus question; any more efficient or sensible way to do this than a massive dict?).
I would like to keep all of this in the same module, but with the functions (dict initialisation, etc.) at the end of the file, so I don't have to scroll down 300 lines to see the code, i.e. laid out as in the example below.
When I run it as below, I get the error 'initlookups is not defined'. When I structure it as initialisation, then function definition, then function use, there is no problem.
I'm sure there must be an obvious way to initialise the functions and associated dict without keeping the code inline, but have tried quite a few so far without success. I can put it in an external module and import this, but would prefer not to for simplicity.
What should I be doing in terms of module structure? Is there any better way than using a dict to store this lookup table (it is 300 unique text keys mapping on to approx. 10 categories)?
Thanks,
Brendan
import ..... (initialisation code, etc)
initLookups() # **Should create the dict - How should this be referenced?**
print(getlookup(KEY)) # **How should this be referenced?**
def initLookups():
    global lookup
    lookup={}
    lookup["A"]="AA"
    lookup["B"]="BB"
    (etc etc etc....)
def getlookup(name):
    if name in lookup.keys():
        getlookup=lookup[name]
    else:
        getlookup=""
    return getlookup

A function needs to be defined before it can be called. If you want to have the code that needs to be executed at the top of the file, just define a main function and call it from the bottom:
import sys
def main(args):
    pass
# All your other function definitions here
if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))
This way, everything main refers to has already been defined by the time main is called. The reason for testing __name__ is that the main function will then only be run when the script is executed directly, not when it is imported by another file.
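Applied to the lookup example from the question, the layout might look like the sketch below. The names init_lookups and the two sample entries are illustrative assumptions; the point is only that main can sit at the top, because the names it uses are looked up when it runs, not when it is defined.

```python
import sys


def main(args):
    # init_lookups is defined further down, but that is fine:
    # the name is only resolved when main() actually runs.
    lookup = init_lookups()
    key = args[0] if args else "A"
    return lookup.get(key, "")


def init_lookups():
    # Illustrative entries; the real table would have ~300 of them.
    return {
        "A": "AA",
        "B": "BB",
    }


if __name__ == "__main__":
    print(main(sys.argv[1:]))
```

The long table now lives at the bottom of the file, and the code you actually read and edit stays at the top.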
Side note: a dict with 300 keys is by no means massive, but you may want to either move the code that fills the dict to a separate module, or (perhaps more fancy) store the key/value pairs in a format like JSON and load it when the program starts.
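As a rough sketch of the JSON idea: the file name lookup.json, the helper load_lookup, and the sample contents are assumptions for illustration. To keep the example self-contained it first writes a small file to a temporary directory; in a real program the file would already exist next to the script.

```python
import json
import os
import tempfile

# Hypothetical contents of lookup.json: name -> category.
sample = {"A": "AA", "B": "BB"}

path = os.path.join(tempfile.mkdtemp(), "lookup.json")
with open(path, "w") as f:
    json.dump(sample, f)


def load_lookup(filename):
    # json.load returns a plain dict for a JSON object.
    with open(filename) as f:
        return json.load(f)


lookup = load_lookup(path)
print(lookup.get("A", ""))  # AA
```

Editing the table then means editing a data file rather than the program.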

Here's a more pythonic way to do this. There aren't a lot of choices, BTW.
A function must be defined before it can be used. Period.
However, you don't have to strictly order all functions for the compiler's benefit. You merely have to put your execution of the functions last.
# import ... (initialisation code, etc.)
def initLookups(): # Definitions must come before actual use
    lookup={}
    lookup["A"]="AA"
    lookup["B"]="BB"
    # (etc etc etc....)
    return lookup
# Any functions initLookups uses can be defined here.
# As long as they're findable in the same module.
if __name__ == "__main__": # Use comes last
    lookup= initLookups()
    print(lookup.get("Key",""))
Note that you don't need the getlookup function, it's a built-in feature of a dict, named get.
Also, "initialisation code" is suspicious. An import should not "do" anything. It should define functions and classes, but not actually provide any executable code. In the long run, executable code that is processed by an import can become a maintenance nightmare.
The most notable exception is a module-level Singleton object that gets created by default. Even then, be sure that the mystery object which makes a module work is clearly identified in the documentation.

If your lookup dict is unchanging, the simplest way is to just make it a module scope variable, i.e.:
lookup = {
    'A' : 'AA',
    'B' : 'BB',
    # ...
}
If you may need to make changes, and later re-initialise it, you can do this in an initialisation function:
def initLookups():
    global lookup
    lookup = {
        'A' : 'AA',
        'B' : 'BB',
        # ...
    }
(Alternatively, lookup.update({'A':'AA', ...}) to change the dict in-place, affecting all callers with access to the old binding.)
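The difference between rebinding the name and changing the dict in place can be seen in a small sketch. The name alias and the two helper functions are made up for illustration; alias stands in for any other code that grabbed a reference to the dict earlier.

```python
lookup = {"A": "AA"}
alias = lookup          # a second reference to the same dict object


def reinit_rebind():
    global lookup
    lookup = {"A": "rebound"}    # binds the name to a brand-new dict


def reinit_in_place():
    lookup.clear()
    lookup.update({"A": "new"})  # mutates the existing dict object


reinit_in_place()
print(alias["A"])        # new  (alias sees the in-place change)

reinit_rebind()
print(alias is lookup)   # False (alias still points at the old dict)
print(alias["A"])        # new  (and so it never sees "rebound")
```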
However, if you've got these lookups in some standard format, it may be simpler to load them from a file and create the dictionary from that.
You can arrange your functions as you wish. The only rule about ordering is that the accessed variables must exist at the time the function is called. It's fine if the function body refers to variables that don't exist yet, so long as nothing actually calls that function before they do, i.e.:
def foo():
    print(greeting + " World") # Note that greeting is not yet defined when foo() is created
greeting = "Hello"
foo() # Prints "Hello World"
But:
def foo():
    print(greeting + " World")
foo() # Gives an error - greeting not yet defined.
greeting = "Hello"
One further thing to note: your getlookup function is very inefficient. In Python 2, "if name in lookup.keys()" builds a list of the keys from the dict and then scans this list to find the item. This loses all the performance benefit the dict gives. (In Python 3, keys() returns a view and the test is fast, but the call is still redundant.) Instead, "if name in lookup" does a direct hash lookup, or even better, use the fact that .get can be given a default to return if the key is not in the dictionary:
def getlookup(name):
    return lookup.get(name, "")

I think that keeping the names in a flat text file and loading them at runtime would be a good alternative. I try to stick to the lowest level of complexity possible with my data, starting with plain text and working up to an RDBMS (I lifted this idea from The Pragmatic Programmer).
Dictionaries are very efficient in python. It's essentially what the whole language is built on. 300 items is well within the bounds of sane dict usage.
names.txt:
A = AAA
B = BBB
C = CCC
getname.py:
import sys
FILENAME = "names.txt"
def main(key):
    # Read all "key = value" lines, closing the file when done
    with open(FILENAME) as f:
        pairs = [line.split("=", 1) for line in f if "=" in line]
    names = dict((x.strip(), y.strip()) for x, y in pairs)
    return names.get(key, "Not found")
if __name__ == "__main__":
    print(main(sys.argv[-1]))
If you really want to keep it all in one module for some reason, you could just stick a string at the top of the module. I think that a big swath of text is less distracting than a huge mess of dict initialization code (and easier to edit later):
import sys
LINES = """
A = AAA
B = BBB
C = CCC
D = DDD
E = EEE""".strip().splitlines()
PAIRS = (line.split("=") for line in LINES)
NAMES = dict((x.strip(), y.strip()) for x,y in PAIRS)
def main(key):
    return NAMES.get(key, "Not found")
if __name__ == "__main__":
    print(main(sys.argv[-1]))

Related

Can you call/use a function returned from a list in Python?

I'm trying to store a function in a list, retrieve the function from the list later, and then call on that function. This is basically what I want to do, without any specifics. It doesn't show my purpose, but it's the same issue.
# List meant to contain tuples with the name of the item and the function of the item.
elements: list = []
def quit_code():
    exit()
elements.append(("quit", quit_code))
Now, somewhere else in the code, I want to be able to use an if statement to check the name of the item and, if it's the right one at that time, run the function.
user_input = "quit" # For brevity, I'm just writing this. Let's just imagine the user actually typed this.
if elements[0][0] == user_input:
    # This is the part I don't understand so I'm just going to make up some syntax.
    run_method(elements[0][1])
The method run_method that I arbitrarily made is the issue. I need a way to run the method returned by elements[0][1], which is the quit_code method. I don't need an alternative solution to this example because I just made it up to display what I want to do. If I have a function or object that contains a function, how can I run that function.
(In the most simplified way I can word it) If I have object_a (for me it's a tuple) that contains str_1 and fun_b, how can I run fun_b from the object.
To expand on this a little more, the reason I can't just directly call the function is because in my program, the function gets put into the tuple via user input and is created locally and then stored in the tuple.
__list_of_stuff: list = []
def add_to_list(name, function):
    __list_of_stuff.append((name, function))
And then somewhere else
def example_init_method():
    def stop_code():
        exit()
    add_to_list("QUIT", stop_code) # pass the function itself; stop_code() would call it right away
Now notice that I can't access the stop_code method anywhere else in the code unless I use it through the __list_of_stuff object.
Finally, It would be nice to not have to make a function for the input. By this, I mean directly inserting code into the parameter without creating a local function like stop_code. I don't know how to do this though.
Python treats functions as first-class citizens. As such, you can do things like:
def some_function():
    # do something
    pass
x = some_function
x()
Since you are storing functions and binding each function with a word (key), the best approach would be a dictionary. Your example could be like this:
def quit_code():
    exit()
operations = dict(quit=quit_code)
operations['quit']()
A dictionary relates a value with a key. The only rule is that the key must be hashable, which in practice means immutable built-in objects such as numbers, strings, and tuples of hashable items.
To create a dictionary, you can use { and }. And to get a value by its key, use [ and ]:
my_dictionary = { 'a' : 1, 'b' : 10 }
print(my_dictionary['a']) # It will print 1
You can also create a dictionary with dict, like so:
my_dictionary = dict(a=1, b=10)
However, this only works for string keys that are valid Python identifiers.
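A short sketch of the two key rules mentioned above. The variable names are made up for illustration: the keyword form of dict only produces string keys, while the literal form accepts any hashable object and rejects unhashable ones such as lists.

```python
# Keyword form: keys must be valid identifiers, and become strings.
by_keyword = dict(a=1, b=10)

# Literal form: any hashable object can be a key.
by_literal = {"a": 1, 42: "int key", (1, 2): "tuple key"}

try:
    by_literal[[1, 2]] = "list key"   # lists are mutable, hence unhashable
except TypeError as exc:
    error = str(exc)                  # the message mentions "unhashable"

print(by_keyword == {"a": 1, "b": 10})  # True
print((1, 2) in by_literal)             # True
```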
But considering you are using quit_code only to wrap the exit call, why not use exit directly?
operations = dict(quit=exit)
operations['quit']()
If dictionaries aren't an option, you could still use lists and tuples:
operations = [('quit', exit)]
for key, fun in operations:
    if key == 'quit':
        fun()

Avoid global variable in a python util file

I have a utilities.py file for my python project. It contains only util functions, for example is_float(string), is_empty(file), etc.
Now I want to have a function is_valid(number), which has to:
read from a file, valid.txt, which contains all numbers which are valid, and load them onto a map/set.
check the map for the presence of number and return True or False.
This function is called often, and running time should be as small as possible. I don't want to open and read valid.txt every time the function is called. The only solution I have come up with is to use a global variable, valid_dict, which is loaded once from valid.txt when utilities.py is imported. The loading code is written as main in utilities.py.
My question is how do I do this without using a global variable, as it is considered bad practice? What is a good design pattern for doing such a task without using globals? Also note again that this is a util file, so there should ideally be no main as such, just functions.
The following is a simple example of a closure. The dictionary, cache, is encapsulated within the outer function (load_func), but remains in scope of the inner, even when it is returned. Notice that load_func returns the inner function as an object, it does not call it.
In utilities.py:
def _load_func(filename):
    cache = {}
    with open(filename) as fn:
        for line in fn:
            key, value = line.split()
            cache[int(key)] = value
    def inner(number):
        return number in cache
    return inner
is_valid = _load_func('valid.txt')
In __main__:
from utilities import is_valid # or something similar
if is_valid(42):
    print(42, 'is valid')
else:
    print(42, 'is not valid')
The dictionary (cache) creation could have been done using a dictionary comprehension, but I wanted you to concentrate on the closure.
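For reference, a sketch of the same closure built with a comprehension. Since is_valid only tests membership, a set is enough here. To stay self-contained, this version takes any iterable of lines instead of opening valid.txt; make_validator is a hypothetical name, and the code assumes the number is the first whitespace-separated field on each line.

```python
def make_validator(lines):
    # Built once, when make_validator is called.
    valid = {int(line.split()[0]) for line in lines if line.strip()}

    def is_valid(number):
        # 'valid' lives on in the closure; no module-level variable needed.
        return number in valid

    return is_valid


# In real use: with open('valid.txt') as f: is_valid = make_validator(f)
is_valid = make_validator(["42 ok\n", "7 ok\n", "\n"])
print(is_valid(42))  # True
print(is_valid(5))   # False
```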
The variable valid_dict would not be global but local to utilities.py. It would only become global if you did something like from utilities import *. Now that is considered bad practice when you're developing a package.
However, I have used a trick in cases like this that essentially requires a static variable: Add an argument valid_dict={} to is_valid(). This dictionary will be instantiated only once and each time the function is called the same dict is available in valid_dict.
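def is_valid(number, valid_dict={}):
    if not valid_dict:
        # first call to is_valid: load valid.txt into valid_dict
        # (assumes the number is the first field on each line)
        with open('valid.txt') as f:
            valid_dict.update((int(line.split()[0]), True) for line in f if line.strip())
    # do your check
    return number in valid_dict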
Do NOT assign to valid_dict in the if-clause but only modify it: e.g., by setting keys valid_dict[x] = y or using something like valid_dict.update(z).
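The reason for that warning can be seen in a small sketch. The function names and the loads list are made up, and the file load is simulated. Assigning to the parameter rebinds only the local name, so the shared default stays empty and the load runs on every call, whereas mutating it in place persists across calls.

```python
loads = []   # records each time the (simulated) file load happens


def is_valid_wrong(number, valid_dict={}):
    if not valid_dict:
        loads.append("wrong")
        valid_dict = {42: True}        # rebinds the local name only
    return number in valid_dict


def is_valid_right(number, valid_dict={}):
    if not valid_dict:
        loads.append("right")
        valid_dict.update({42: True})  # mutates the shared default object
    return number in valid_dict


for _ in range(3):
    is_valid_wrong(42)
    is_valid_right(42)

print(loads.count("wrong"))  # 3
print(loads.count("right"))  # 1
```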
(PS: Let me know if this is considered "dirty" or "un-pythonic".)

Modify *existing* variable in `locals()` or `frame.f_locals`

I have found some vaguely related questions to this question, but not any clean and specific solution for CPython. And I assume that a "valid" solution is interpreter specific.
First the things I think I understand:
locals() gives a non-modifiable dictionary.
A function may (and indeed does) use some kind of optimization to access its local variables
frame.f_locals gives a locals() like dictionary, but less prone to hackish things through exec. Or at least I have been less able to do hackish undocumented things like the locals()['var'] = value ; exec ""
exec is capable of doing weird things to the local variables, but it is not reliable, e.g. I read somewhere that it doesn't work in Python 3. Haven't tested.
So I understand that, given those limitations, it will never be safe to add extra variables to the locals, because it breaks the interpreter structure.
However, it should be possible to change an already existing variable, shouldn't it?
Things that I considered
In a function f, one can access the f.func_code.co_nlocals and f.func_code.co_varnames.
In a frame, the variables can be accessed / checked / read through the frame.f_locals. This is in the use case of setting a tracer through sys.settrace.
One can easily access the function a frame belongs to, considering the use case of setting a trace and using it to "do things" with the local variables given a certain trigger or whatever.
The variables should be somewhere, preferably writeable... but I am not capable of finding it. Even if it is an array (for interpreter efficient access), or I need some extra C-specific wiring, I am ready to commit to it.
How can I achieve that modification of variables from a tracer function or from a decorated wrapped function or something like that?
A full solution will be of course appreciated, but even some pointers will help me greatly, because I'm stuck here with lots of non writeable dictionaries :-/
Edit: Hackish exec is doing things like this or this
There is an undocumented C-API call for doing things like that:
PyFrame_LocalsToFast
There is some more discussion in this PyDev blog post. The basic idea seems to be:
import ctypes
...
frame.f_locals.update({
    'a': 'newvalue',
    'b': other_local_value,
})
ctypes.pythonapi.PyFrame_LocalsToFast(
    ctypes.py_object(frame), ctypes.c_int(0))
I have yet to test if this works as expected.
Note that there might be some way to access the fast locals directly, to avoid an indirection if the requirement is only to modify an existing variable. But, as this seems to be a mostly undocumented API, the source code is the documentation resource.
Based on the notes from MariusSiuram, I wrote a recipe that shows the behavior.
The conclusions are:
we can modify an existing variable
we can delete an existing variable
we can NOT add a new variable.
So, here is the code:
import inspect
import ctypes
def parent():
    a = 1
    z = 'foo'
    print('- Trying to add a new variable ---------------')
    hack(case=0) # just try to add a new variable 'b'
    print(a)
    print(z)
    assert a == 1
    assert z == 'foo'
    try:
        print(b)
        assert False # never is going to reach this point
    except NameError as why:
        print("ok, global name 'b' is not defined")
    print('- Trying to remove an existing variable ------')
    hack(case=1)
    print(a)
    assert a == 2
    try:
        print(z)
    except NameError as why:
        print("ok, we've removed the 'z' var")
    print('- Trying to update an existing variable ------')
    hack(case=2)
    print(a)
    assert a == 3
def hack(case=0):
    frame = inspect.stack()[1][0] # the caller's frame
    if case == 0:
        frame.f_locals['b'] = "don't work"
    elif case == 1:
        frame.f_locals.pop('z')
        frame.f_locals['a'] += 1
    else:
        frame.f_locals['a'] += 1
    # passing c_int(1) will remove and update variables as well
    # passing c_int(0) will only update
    ctypes.pythonapi.PyFrame_LocalsToFast(
        ctypes.py_object(frame),
        ctypes.c_int(1))
if __name__ == '__main__':
    parent()
The output would be like:
- Trying to add a new variable ---------------
1
foo
ok, global name 'b' is not defined
- Trying to remove an existing variable ------
2
foo
- Trying to update an existing variable ------
3

Call a user defined function given its name as a string input to another function Python 2.7

I'm new to Python and using Anaconda (editor: Spyder) to write some simple functions. I've created a collection of 20 functions and saved them in separate .py files (file names are the same as function names).
For example
def func1(X):
    Y=...
    return Y
I have another function that takes as input a function name as string (one of those 20 functions), calls it, does some calculations and return the output.
def Main(String,X):
    Z=...
    W=String(Z)
    V=...
    return V
How can I choose the function based on string input?
More details:
The Main function calculates the Sobol Indices of a given function. I write the Main function. My colleagues write their own functions (each might be more than 500 lines of code) and just want to use Main to get the Sobol indices. I will give Main to other people so I do NOT know what Main will get as a function in the future. I also do not want the user of Main to go through the trouble of making a dictionary.
Functions are objects in Python. This means you can store them in dictionaries. One approach is to dispatch the function calls by storing the names you wish to call as keys and the functions as values.
So for example:
import func1, func2
operation_dispatcher = {
    "func1": getattr(func1, "func1"),
    "func2": getattr(func2, "func2"),
}
def something_calling_funcs(func_name, param):
    """Calls func_name with param and returns its result"""
    func_to_call = operation_dispatcher.get(func_name, None)
    if func_to_call:
        return func_to_call(param)
Now it might be possible to generate the dispatch table more automatically with something like __import__ but there might be a better design in this case (perhaps consider reorganizing your imports).
EDIT took me a minute to fully test this because I had to set up a few files, you can potentially do something like this if you have a lot of names to import and don't want to have to specify each one manually in the dictionary:
import importlib
func_names = ["func1", "func2"]
operation_dispatch = {
    name : getattr(importlib.import_module(name), name)
    for name in func_names}
# usage
result = operation_dispatch[function_name](param)
Note that this assumes that the function names and module names are the same. This uses importlib to import the module names from the strings provided in func_names here.
You'll just want to have a dictionary of functions, like this:
call = {
    "func1": func1,
    "functionOne": func1,
    "func2": func2,
}
Note that you can have multiple keys for the same function if necessary and the name doesn't need to match the function exactly, as long as the user enters the right key.
Then you can call this function like this:
def Main(String,X):
    Z=...
    W=call[String](Z)
    V=...
    return V
Though I recommend catching an error when the user fails to enter a valid key.
def Main(String,X):
    Z=...
    try:
        W=call[String](Z)
    except KeyError:
        raise NameError(String + " is not a valid function key")
    V=...
    return V

Computing a function name from another function name

In python 3.4, I want to be able to do a very simple dispatch table for testing purposes. The idea is to have a dictionary with the key being a string of the name of the function to be tested and the data item being the name of the test function.
For example:
myTestList = (
    "myDrawFromTo",
    "myDrawLineDir"
)
myTestDict = {
    "myDrawFromTo": test_myDrawFromTo,
    "myDrawLineDir": test_myDrawLineDir
}
for myTest in myTestList:
    result = myTestDict[myTest]()
The idea is that I have a list of function names someplace. In this example, I manually create a dictionary that maps those names to the names of test functions. The test function names are a simple extension of the function name. I'd like to compute the entire dictionary from the list of function names (here it is myTestList).
Alternatively, if I can do the same thing without the dictionary, that'd be fine, too. I tried just building a new string from the entries in myTestList and then using locals() to set up the call, but didn't have any luck. The dictionary idea came from the Python 3.x documentation.
There are two parts to the problem.
The easy part is just prefixing 'test_' onto each string:
tests = {test: 'test_'+test for test in myTestList}
The harder part is actually looking up the functions by name. That kind of thing is usually a bad idea, but you've hit on one of the cases (generating tests) where it often makes sense. You can do this by looking them up in your module's global dictionary, like this:
tests = {test: globals()['test_'+test] for test in myTestList}
There are variations on the same idea if the tests live somewhere other than the module's global scope. For example, it might be a good idea to make them all methods of a class, in which case you'd do:
tester = TestClass()
tests = {test: getattr(tester, 'test_'+test) for test in myTestList}
(Although more likely that code would be inside TestClass, so it would be using self rather than tester.)
If you don't actually need the dict, of course, you can change the comprehension to an explicit for statement:
for test in myTestList:
    globals()['test_'+test]()
One more thing: Before reinventing the wheel, have you looked at the testing frameworks built into the stdlib, or available on PyPI?
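As a rough idea of what the stdlib option looks like, here is a minimal unittest sketch. The function my_draw_from_to and its return value are hypothetical stand-ins for the code under test. unittest finds test methods by the "test" name prefix on its own, so no hand-built dispatch table is needed.

```python
import unittest


def my_draw_from_to():
    # Hypothetical stand-in for a function under test.
    return "line"


class DrawTests(unittest.TestCase):
    # unittest collects methods whose names start with "test".
    def test_my_draw_from_to(self):
        self.assertEqual(my_draw_from_to(), "line")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DrawTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```

In a standalone test file, the last three lines are usually replaced by a call to unittest.main() under the usual __name__ guard.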
Abarnert's answer seems to be useful but to answer your original question of how to call all test functions for a list of function names:
def test_f():
    print("testing f...")
def test_g():
    print("testing g...")
myTestList = ['f', 'g']
for funcname in myTestList:
    eval('test_' + funcname + '()')
