make shorter os.getenv in Python - python

import os
BUCKET = os.getenv("BUCKET")
IN_CSV = os.getenv("IN_CSV")
OUT_CSV = os.getenv("OUT_CSV")
now, you see the problem right? I don't want to retype the variable name twice, is there a way to not do it? maybe some function get_and_init_env.
get_and_init_env(BUCKET) after this is executed there should be a variable of name BUCKET with value os.getenv("BUCKET") in locals()

May not be exactly what you need but to save time typing in things built on ipython, I once made a class that took in a dict of strings (such as one that can easily be made from os.environ), and in it's __init__ it called setattr to make itself have attributes that reflected the dict contents. From there I just had to .blah that instance instead of ['blah'] but more importantly in ipython could .b<tab> and bring up the items it could be. Probably went something like
...
class DotDict:
def __init__(self,dictish):
self._original = dict(dictish) #a dict has a lot of useful capabilities that can be routed to it...
for x,y in self._original.items():
setattr(self,CleanStr(x),y)
...
...
#make useful dicts part of the module
env =DotDict(os.environ)
...
from MyMod import env as env0
env0.BUCKET #just use it...
Since most environ vars should be pretty clean, you can probably just use x instead of CleanStr(x) but should really have a way to make any x object into a valid name, be it str or repr or hash related and prefixed by some favorite character sequence.

Related

Use function parameter to construct name of object or dataframe

I would like to use a function's parameter to create dynamic names of dataframes and/or objects in Python. I have about 40 different names so it would be really elegant to do this in a function. Is there a way to do this or do I need to do this via 'dict'? I read that 'exec' is dangerous (not that I could get this to work). SAS has this feature for their macros which is where I am coming from. Here is an example of what I am trying to do (using '#' for illustrative purposes):
def TrainModels (mtype):
model_#mtype = ExtraTreesClassifier()
model_#mtype.fit(X_#mtype, Y_#mtype)
TrainModels ('FirstModel')
TrainModels ('SecondModel')
You could use a dictionary for this:
models = {}
def TrainModels (mtype):
models[mtype] = ExtraTreesClassifier()
models[mtype].fit()
First of all, any name you define within your TrainModels function will be local to that function, so won't be accessible in the rest of your program. So you have to define a global name.
Everything in Python is a dictionary, including the global namespace. You can define a new global name dynamically as follows:
my_name = 'foo'
globals()[my_name] = 'bar'
This is terrible and you should never do it. It adds too much indirection to your code. When someone else (or yourself in 3 months when the code is no longer fresh in your mind) reads the code and see 'foo' used elsewhere, they'll have a hard time figuring out where it came from. Code analysis tools will not be able to help you.
I would use a dict as Milkboat suggested.

Avoid global variable in a python util file

I have a utilities.py file for my python project. It contains only util functions, for example is_float(string), is_empty(file), etc.
Now I want to have a function is_valid(number), which has to:
read from a file, valid.txt, which contains all numbers which are valid, and load them onto a map/set.
check the map for the presence of number and return True or False.
This function is called often, and running time should be as small as possible. I don't want to read open and read valid.txt everytime the function is called. The only solution I have come up with is to use a global variable, valid_dict, which is loaded once from valid.txt when utilities.py is imported. The loading code is written as main in utilities.py.
My question is how do I do this without using a global variable, as it is considered bad practice? What is a good design pattern for doing such a task without using globals? Also note again that this is a util file, so there should ideally be no main as such, just functions.
The following is a simple example of a closure. The dictionary, cache, is encapsulated within the outer function (load_func), but remains in scope of the inner, even when it is returned. Notice that load_func returns the inner function as an object, it does not call it.
In utilities.py:
def _load_func(filename):
cache = {}
with open(filename) as fn:
for line in fn:
key, value = line.split()
cache[int(key)] = value
def inner(number):
return number in cache
return inner
is_valid = _load_func('valid.txt')
In __main__:
from utilities import is_valid # or something similar
if is_valid(42):
print(42, 'is valid')
else:
print(42, 'is not valid')
The dictionary (cache) creation could have been done using a dictionary comprehension, but I wanted you to concentrate on the closure.
The variable valid_dict would not be global but local to utilities.py. It would only become global if you did something like from utilities import *. Now that is considered bad practice when you're developing a package.
However, I have used a trick in cases like this that essentially requires a static variable: Add an argument valid_dict={} to is_valid(). This dictionary will be instantiated only once and each time the function is called the same dict is available in valid_dict.
def is_valid(number, valid_dict={}):
if not valid_dict:
# first call to is_valid: load valid.txt into valid_dict
# do your check
Do NOT assign to valid_dict in the if-clause but only modify it: e.g., by setting keys valid_dict[x] = y or using something like valid_dict.update(z).
(PS: Let me know if this is considered "dirty" or "un-pythonic".)

Computing a function name from another function name

In python 3.4, I want to be able to do a very simple dispatch table for testing purposes. The idea is to have a dictionary with the key being a string of the name of the function to be tested and the data item being the name of the test function.
For example:
myTestList = (
"myDrawFromTo",
"myDrawLineDir"
)
myTestDict = {
"myDrawFromTo": test_myDrawFromTo,
"myDrawLineDir": test_myDrawLineDir
}
for myTest in myTestList:
result = myTestDict[myTest]()
The idea is that I have a list of function names someplace. In this example, I manually create a dictionary that maps those names to the names of test functions. The test function names are a simple extension of the function name. I'd like to compute the entire dictionary from the list of function names (here it is myTestList).
Alternately, if I can do the same thing without the dictionary, that'd be fine, too. I tried just building a new string from the entries in myTestList and then using local() to set up the call, but didn't have any luck. The dictionary idea came from the Python 3.x documentation.
There are two parts to the problem.
The easy part is just prefixing 'text_' onto each string:
tests = {test: 'test_'+test for test in myTestDict}
The harder part is actually looking up the functions by name. That kind of thing is usually a bad idea, but you've hit on one of the cases (generating tests) where it often makes sense. You can do this by looking them up in your module's global dictionary, like this:
tests = {test: globals()['test_'+test] for test in myTestList}
There are variations on the same idea if the tests live somewhere other than the module's global scope. For example, it might be a good idea to make them all methods of a class, in which case you'd do:
tester = TestClass()
tests = {test: getattr(tester, 'test_'+test) for test in myTestList}
(Although more likely that code would be inside TestClass, so it would be using self rather than tester.)
If you don't actually need the dict, of course, you can change the comprehension to an explicit for statement:
for test in myTestList:
globals()['test_'+test]()
One more thing: Before reinventing the wheel, have you looked at the testing frameworks built into the stdlib, or available on PyPI?
Abarnert's answer seems to be useful but to answer your original question of how to call all test functions for a list of function names:
def test_f():
print("testing f...")
def test_g():
print("testing g...")
myTestList = ['f', 'g']
for funcname in myTestList:
eval('test_' + funcname + '()')

Changing variables in separate Python files

So I just found out today that I can import variables from other Python files. For example, I can have a variable green = 1, and have a completely separate file use that variable. I've found that this is really helpful for functions.
So, here's my question. I'm sorry if the title didn't help very much, I wasn't entirely sure what this would be called.
I want my program to ask the player what his or her name is. Once the player has answered, I want that variable to be stored in a separate Python file, "player_variables.py", so every time I want to say the player's name, instead of having to go to from game import name, I can use from player_variables import name, just to make it easier.
I fully understand that this is a lazy man's question, but I'm just trying to learn as much as I could. I'm still very new, and I'm sorry if this question is ridiculous. :).
Thanks for the help, I appreciate it. (be nice to me!)
From your question, I think you're confusing some ideas about variables and the values of variables.
1) Simply creating a python file with variable names allows you to access the values of those variables defined therein. Example:
# myvariables.py
name = 'Steve'
# main.py
import myvariables
print myvariables.name
Since this requires that the variables themselves be defined as global, this solution is not ideal. See this link for more: https://docs.python.org/2/reference/simple_stmts.html#global
2) However, as your question states that you want your program to save the variable of some user-entered value in the python file, that is another thing entirely as that's metaprogramming in python - as your program is outputting Python code, in this case a file with the same name as your variables python file.
From the above, if this is the means by which you're accessing and updating variables values, it is a simpler idea to have a config file which can be loaded or changed at whim. Metaprogramming is only necessary when you need it.
I think you should use a data format for storing data, not a data manipulating programming language for this:
import json
import myconfig
def load_data():
with open(myconfig.config_loc, 'r') as json_data:
return json.loads(json_data.read())
def save_data(name, value):
existing_data = load_data()
existing_data[name] = value
with open(myconfig.config_loc, 'w') as w:
w.write(json.dumps(existing_data, indent=4, sort_keys=True))
You should only store the location of this json file in your myvariables.py file, but I'd call it something else (in this case, myconfig, or just config).
More on the json library can be found here.

Scope, using functions in current module

I know this must be a trivial question, but I've tried many different ways, and searched quie a bit for a solution, but how do I create and reference subfunctions in the current module?
For example, I am writing a program to parse through a text file, and for each of the 300 different names in it, I want to assign to a category.
There are 300 of these, and I have a list of these structured to create a dict, so of the form lookup[key]=value (bonus question; any more efficient or sensible way to do this than a massive dict?).
I would like to keep all of this in the same module, but with the functions (dict initialisation, etc) at the
end of the file, so I dont have to scroll down 300 lines to see the code, i.e. as laid out as in the example below.
When I run it as below, I get the error 'initlookups is not defined'. When I structure is so that it is initialisation, then function definition, then function use, no problem.
I'm sure there must be an obvious way to initialise the functions and associated dict without keeping the code inline, but have tried quite a few so far without success. I can put it in an external module and import this, but would prefer not to for simplicity.
What should I be doing in terms of module structure? Is there any better way than using a dict to store this lookup table (It is 300 unique text keys mapping on to approx 10 categories?
Thanks,
Brendan
import ..... (initialisation code,etc )
initLookups() # **Should create the dict - How should this be referenced?**
print getlookup(KEY) # **How should this be referenced?**
def initLookups():
global lookup
lookup={}
lookup["A"]="AA"
lookup["B"]="BB"
(etc etc etc....)
def getlookup(value)
if name in lookup.keys():
getlookup=lookup[name]
else:
getlookup=""
return getlookup
A function needs to be defined before it can be called. If you want to have the code that needs to be executed at the top of the file, just define a main function and call it from the bottom:
import sys
def main(args):
pass
# All your other function definitions here
if __name__ == '__main__':
exit(main(sys.argv[1:]))
This way, whatever you reference in main will have been parsed and is hence known already. The reason for testing __name__ is that in this way the main method will only be run when the script is executed directly, not when it is imported by another file.
Side note: a dict with 300 keys is by no means massive, but you may want to either move the code that fills the dict to a separate module, or (perhaps more fancy) store the key/value pairs in a format like JSON and load it when the program starts.
Here's a more pythonic ways to do this. There aren't a lot of choices, BTW.
A function must be defined before it can be used. Period.
However, you don't have to strictly order all functions for the compiler's benefit. You merely have to put your execution of the functions last.
import # (initialisation code,etc )
def initLookups(): # Definitions must come before actual use
lookup={}
lookup["A"]="AA"
lookup["B"]="BB"
(etc etc etc....)
return lookup
# Any functions initLookups uses, can be define here.
# As long as they're findable in the same module.
if __name__ == "__main__": # Use comes last
lookup= initLookups()
print lookup.get("Key","")
Note that you don't need the getlookup function, it's a built-in feature of a dict, named get.
Also, "initialisation code" is suspicious. An import should not "do" anything. It should define functions and classes, but not actually provide any executable code. In the long run, executable code that is processed by an import can become a maintenance nightmare.
The most notable exception is a module-level Singleton object that gets created by default. Even then, be sure that the mystery object which makes a module work is clearly identified in the documentation.
If your lookup dict is unchanging, the simplest way is to just make it a module scope variable. ie:
lookup = {
'A' : 'AA',
'B' : 'BB',
...
}
If you may need to make changes, and later re-initialise it, you can do this in an initialisation function:
def initLookups():
global lookup
lookup = {
'A' : 'AA',
'B' : 'BB',
...
}
(Alternatively, lookup.update({'A':'AA', ...}) to change the dict in-place, affecting all callers with access to the old binding.)
However, if you've got these lookups in some standard format, it may be simpler simply to load it from a file and create the dictionary from that.
You can arrange your functions as you wish. The only rule about ordering is that the accessed variables must exist at the time the function is called - it's fine if the function has references to variables in the body that don't exist yet, so long as nothing actually tries to use that function. ie:
def foo():
print greeting, "World" # Note that greeting is not yet defined when foo() is created
greeting = "Hello"
foo() # Prints "Hello World"
But:
def foo():
print greeting, "World"
foo() # Gives an error - greeting not yet defined.
greeting = "Hello"
One further thing to note: your getlookup function is very inefficient. Using "if name in lookup.keys()" is actually getting a list of the keys from the dict, and then iterating over this list to find the item. This loses all the performance benefit the dict gives. Instead, "if name in lookup" would avoid this, or even better, use the fact that .get can be given a default to return if the key is not in the dictionary:
def getlookup(name)
return lookup.get(name, "")
I think that keeping the names in a flat text file, and loading them at runtime would be a good alternative. I try to stick to the lowest level of complexity possible with my data, starting with plain text and working up to a RDMS (I lifted this idea from The Pragmatic Programmer).
Dictionaries are very efficient in python. It's essentially what the whole language is built on. 300 items is well within the bounds of sane dict usage.
names.txt:
A = AAA
B = BBB
C = CCC
getname.py:
import sys
FILENAME = "names.txt"
def main(key):
pairs = (line.split("=") for line in open(FILENAME))
names = dict((x.strip(), y.strip()) for x,y in pairs)
return names.get(key, "Not found")
if __name__ == "__main__":
print main(sys.argv[-1])
If you really want to keep it all in one module for some reason, you could just stick a string at the top of the module. I think that a big swath of text is less distracting than a huge mess of dict initialization code (and easier to edit later):
import sys
LINES = """
A = AAA
B = BBB
C = CCC
D = DDD
E = EEE""".strip().splitlines()
PAIRS = (line.split("=") for line in LINES)
NAMES = dict((x.strip(), y.strip()) for x,y in PAIRS)
def main(key):
return NAMES.get(key, "Not found")
if __name__ == "__main__":
print main(sys.argv[-1])

Categories

Resources