In Python it is easy to create new functions programmatically. How would I assign this to programmatically determined names in the current scope?
This is what I'd like to do (in non-working code):
obj_types = ('cat', 'dog', 'donkey', 'camel')
for obj_type in obj_types:
'create_'+obj_type = lambda id: id
In the above example, the assignment of lambda into a to-be-determined function name obviously does not work. In the real code, the function itself would be created by a function factory.
The background is lazyness and do-not-repeat-yourself: I've got a dozen and more object types for which I'd assign a generated function. So the code currently looks like:
create_cat = make_creator('cat')
# ...
create_camel = make_creator('camel')
The functions create_cat etc are used hardcoded in a parser.
If I would create classes as a new type programmatically, types.new_class() as seen in the docs seems to be the solution.
Is it my best bet to (mis)use this approach?
One way to accomplish what you are trying to do (but not create functions with dynamic names) is to store the lamda's in a dict using the name as the key. Instead of calling create_cat() you would call create['cat'](). That would dovetail nicely with not hardcoding names in the parser logic as well.
Vaughn Cato points out that one could just assign into locals()[object_type] = factory(object_type). However the Python docs prohibit this: "Note: The contents of this dictionary should not be modified; changes may not affect the values of local and free variables used by the interpreter"
D. Shawley points out that it would be wiser to use a dict() object which entries would hold the functions. Access would be simple by using create['cat']() in the parser. While this is compelling I do not like the syntax overhead of the brackets and ticks required.
J.F. Sebastian points to classes. And this is what I ended up with:
# Omitting code of these classes for clarity
class Entity:
def __init__(file_name, line_number):
# Store location, good for debug, messages, and general indexing
# The following classes are the real objects to be generated by a parser
# Their constructors must consume whatever data is provided by the tokens
# as well as calling super() to forward the file_name,line_number info.
class Cat(Entity): pass
class Camel(Entity): pass
class Parser:
def parse_file(self, fn):
# ...
# Function factory to wrap object constructor calls
def create_factory(obj_type):
def creator(text, line_number, token):
try:
return obj_type(*token,
file_name=fn, line_number=line_number)
except Exception as e:
# For debug of constructor during development
print(e)
return creator
# Helper class, serving as a 'dictionary' of obj construction functions
class create: pass
for obj_type in (Cat, Camel):
setattr(create,
obj_type.__name__.lower(),
create_factory(obj_type))
# Parsing code now can use (again simplified for clarity):
expression = Keyword('cat').setParseAction(create.cat)
This is helper code for deploying a pyparsing parser. D. Shawley is correct in that the dict would actually more easily allow to dynamically generate the parser grammar.
Related
I am trying to write a testing program for a python program that takes data, does calculations on it, then puts the output in a class instance object. This object contains several other objects, each with their own attributes. I'm trying to access all the attributes and sub-attributes dynamically with a one size fits all solution, corresponding to elements in a dictionary I wrote to cycle through and get all those attributes for printing onto a test output file.
Edit: this may not be clear from the above but I have a list of the attributes I want, so using something to actually get those attributes is not a problem, although I'm aware python has methods that accomplish this. What I need to do is to be able to get all of those attributes with the same function call, regardless of whether they are top level object attributes or attributes of object attributes.
Python is having some trouble with this - first I tried doing something like this:
for string in attr_dictionary:
...
outputFile.print(outputclass.string)
...
But Python did not like this, and returned an AttributeError
After checking SE, I learned that this is a supposed solution:
for string in attr_dictionary:
...
outputFile.print(getattr(outputclass, string))
...
The only problem is - I want to dynamically access the attributes of objects that are attributes of outputclass. So ideally it would be something like outputclass.objectAttribute.attribute, but this does not work in python. When I use getattr(outputclass, objectAttribute.string), python returns an AttributeError
Any good solution here?
One thing I have thought of trying is creating methods to return those sub-attributes, something like:
class outputObject:
...
def attributeIWant(self,...):
return self.subObject.attributeIWant
...
Even then, it seems like getattr() will return an error because attributeIWant() is supposed to be a function call, it's not actually an attribute. I'm not certain that this is even within the capabilities of Python to make this happen.
Thank you in advance for reading and/or responding, if anyone is familiar with a way to do this it would save me a bunch of refactoring or additional code.
edit: Additional Clarification
The class for example is outputData, and inside that class you could have and instance of the class furtherData, which has the attribute dataIWant:
class outputData:
example: furtherData
example = furtherData()
example.dataIWant = someData
...
with the python getattr I can't access both attributes directly in outputData and attributes of example unless I use separate calls, the attribute of example needs two calls to getattr.
Edit2: I have found a solution I think works for this, see below
I was able to figure this out - I just wrote a quick function that splits the attribute string (for example outputObj.subObj.propertyIWant) then proceeds down the resultant array, calling getattr on each subobject until it reaches the end of the array and returns the actual attribute.
Code:
def obtainAttribute(sample, attributeString: str):
baseObj = sample
attrArray = attributeString.split(".")
for string in attrArray:
if(attrArray.index(string) == (len(attrArray) - 1)):
return getattr(baseObj,string)
else:
baseObj = getattr(baseObj,string)
return "failed"
sample is the object and attributeString is, for example object.subObject.attributeYouWant
Do I have to formally define a function before I can use it as an element of a dictionary?
def my_func():
print 'my_func'
d = {
'function': my_func
}
I would rather define the function inline. I just tried to type out what I want to do, but the whitespace policies of python syntax make it very hard to define an inline func within a dict. Is there any way to do this?
The answer seems to be that there is no way to declare a function inline a dictionary definition in python. Thanks to everyone who took the time to contribute.
Do you really need a dictionary, or just getitem access?
If the latter, then use a class:
>>> class Dispatch(object):
... def funcA(self, *args):
... print('funcA%r' % (args,))
... def funcB(self, *args):
... print('funcB%r' % (args,))
... def __getitem__(self, name):
... return getattr(self, name)
...
>>> d = Dispatch()
>>>
>>> d['funcA'](1, 2, 3)
funcA(1, 2, 3)
You could use a decorator:
func_dict = {}
def register(func):
func_dict[func.__name__] = func
return func
#register
def a_func():
pass
#register
def b_func():
pass
The func_dict will end up mapping using the entire name of the function:
>>> func_dict
{'a_func': <function a_func at 0x000001F6117BC950>, 'b_func': <function b_func at 0x000001F6117BC8C8>}
You can modify the key used by register as desired. The trick is that we use the __name__ attribute of the function to get the appropriate string.
Consider using lambdas, but note that lambdas can only consist of one expression and cannot contain statements (see http://docs.python.org/reference/expressions.html#lambda).
e.g.
d = { 'func': lambda x: x + 1 }
# call d['func'](2) will return 3
Also, note that in Python 2, print is not a function. So you have to do either:
from __future__ import print_function
d = {
'function': print
}
or use sys.stdout.write instead
d = {
'function': sys.stdout.write
}
Some functions can be easily 'inlined' anonymously with lambda expressions, e.g.:
>>> d={'function': lambda x : x**2}
>>> d['function'](5)
25
But for anything semi-complex (or using statements) you probably just should define them beforehand.
There is no good reason to want to write this using a dictionary in Python. It's strange and is not a common way to namespace functions.
The the Python philosophies that apply here are:
There should be one-- and preferably only one --obvious way to do it.
Combined with
Readability counts.
Doing it this way also makes things hard to understand and read for the typical Python user.
The good things the dictionary does in this case is map strings to functions and namespace them within a dictionary, but this functionality is already provided by both modules and classes and it's much easier to understand by those familiar with Python.
Examples:
Module method:
#cool.py
def cool():
print 'cool'
Now use the module like you would be using your dict:
import cool
#cool.__dict__['cool']()
#update - to the more correct idiom vars
vars(cool)['cool']()
Class method:
class Cool():
def cool():
print 'cool'
#Cool.__dict__['cool']()
#update - to the more correct idiom vars
vars(Cool)['cool']()
Edit after comment below:
argparse seems like a good fit for this problem, so you don't have to reinvent the wheel. If you do decide to implement it completely yourself though argparse source should give you some good direction. Anyways the sections below seem to apply to this use case:
15.4.4.5. Beyond sys.argv
Sometimes it may be useful to have an ArgumentParser parse arguments
other than those of sys.argv. This can be accomplished by passing a
list of strings to parse_args(). This is useful for testing at the
interactive prompt:
15.4.5.1. Sub-commands¶
ArgumentParser.add_subparsers()
Many programs split up their functionality into a number of sub-commands, for example, the svn program can invoke sub-commands
like svn checkout, svn update, and svn commit.
15.4.4.6. The Namespace object
It may also be useful to have an ArgumentParser assign attributes to
an already existing object, rather than a new Namespace object. This
can be achieved by specifying the namespace= keyword argument:
Update, here's an example using argparse
strategizer = argparse.ArgumentParser()
strat_subs = strategizer.add_subparsers()
math = strat_subs.add_parser('math')
math_subs = math.add_subparsers()
math_max = math_subs.add_parser('max')
math_sum = math_subs.add_parser('sum')
math_max.set_defaults(strategy=max)
math_sum.set_defaults(strategy=sum)
strategizer.parse_args('math max'.split())
Out[46]: Namespace(strategy=<built-in function max>)
strategizer.parse_args('math sum'.split())
Out[47]: Namespace(strategy=<built-in function sum>)
I would like to note the reasons I would recommend argparse
Mainly the requirement to use strings that represent options and sub options to map to functions.
It's dead simple (after getting past the feature filled argparse module).
Uses a Python Standard Library Module. This let's others familiar with Python grok what your doing without getting into implementation details, and is very well documented for those who aren't.
Many extra features could be taken advantage of out of the box (not the best reason!).
Using argparse and Strategy Pattern together
For the plain and simple implementation of the Strategy Pattern, this has already been answered very well.
How to write Strategy Pattern in Python differently than example in Wikipedia?
#continuing from the above example
class MathStudent():
def do_math(self, numbers):
return self.strategy(numbers)
maximus = strategizer.parse_args('math max'.split(),
namespace=MathStudent())
sumera = strategizer.parse_args('math sum'.split(),
namespace=MathStudent())
maximus.do_math([1, 2, 3])
Out[71]: 3
sumera.do_math([1, 2, 3])
Out[72]: 6
The point of inlining functions is to blur the distinction between dictionaries and class instances. In javascript, for example, this techinque makes it very pleasant to write control classes that have little reusability. Also, and very helpfully the API then conforms to the well-known dictionary protocols, being self explanatory (pun intended).
You can do this in python - it just doesn't look like a dictionary! In fact, you can use the class keyword in ANY scope (i.e. a class def in a function, or a class def inside of a class def), and it's children can be the dictonary you are looking for; just inspect the attributes of a definition as if it was a javascript dictionary.
Example as if it was real:
somedict = {
"foo":5,
"one_function":your method here,
"two_function":your method here,
}
Is actually accomplished as
class somedict:
foo = 5
#classmethod
def one_method(self):
print self.foo
self.foo *= 2;
#classmethod
def two_method(self):
print self.foo
So that you can then say:
somedict.foo #(prints 5)
somedict.one_method() #(prints 5)
somedict.two_method() #(prints 10)
And in this way, you get the same logical groupings as you would with your "inlining".
In python 3.4, I want to be able to do a very simple dispatch table for testing purposes. The idea is to have a dictionary with the key being a string of the name of the function to be tested and the data item being the name of the test function.
For example:
myTestList = (
"myDrawFromTo",
"myDrawLineDir"
)
myTestDict = {
"myDrawFromTo": test_myDrawFromTo,
"myDrawLineDir": test_myDrawLineDir
}
for myTest in myTestList:
result = myTestDict[myTest]()
The idea is that I have a list of function names someplace. In this example, I manually create a dictionary that maps those names to the names of test functions. The test function names are a simple extension of the function name. I'd like to compute the entire dictionary from the list of function names (here it is myTestList).
Alternately, if I can do the same thing without the dictionary, that'd be fine, too. I tried just building a new string from the entries in myTestList and then using local() to set up the call, but didn't have any luck. The dictionary idea came from the Python 3.x documentation.
There are two parts to the problem.
The easy part is just prefixing 'text_' onto each string:
tests = {test: 'test_'+test for test in myTestDict}
The harder part is actually looking up the functions by name. That kind of thing is usually a bad idea, but you've hit on one of the cases (generating tests) where it often makes sense. You can do this by looking them up in your module's global dictionary, like this:
tests = {test: globals()['test_'+test] for test in myTestList}
There are variations on the same idea if the tests live somewhere other than the module's global scope. For example, it might be a good idea to make them all methods of a class, in which case you'd do:
tester = TestClass()
tests = {test: getattr(tester, 'test_'+test) for test in myTestList}
(Although more likely that code would be inside TestClass, so it would be using self rather than tester.)
If you don't actually need the dict, of course, you can change the comprehension to an explicit for statement:
for test in myTestList:
globals()['test_'+test]()
One more thing: Before reinventing the wheel, have you looked at the testing frameworks built into the stdlib, or available on PyPI?
Abarnert's answer seems to be useful but to answer your original question of how to call all test functions for a list of function names:
def test_f():
print("testing f...")
def test_g():
print("testing g...")
myTestList = ['f', 'g']
for funcname in myTestList:
eval('test_' + funcname + '()')
I just started building a text based game yesterday as an exercise in learning Python (I'm using 3.3). I say "text based game," but I mean more of a MUD than a choose-your-own adventure. Anyway, I was really excited when I figured out how to handle inheritance and multiple inheritance using super() yesterday, but I found that the argument-passing really cluttered up the code, and required juggling lots of little loose variables. Also, creating save files seemed pretty nightmarish.
So, I thought, "What if certain class hierarchies just took one argument, a dictionary, and just passed the dictionary back?" To give you an example, here are two classes trimmed down to their init methods:
class Actor:
def __init__(self, in_dict,**kwds):
super().__init__(**kwds)
self._everything = in_dict
self._name = in_dict["name"]
self._size = in_dict["size"]
self._location = in_dict["location"]
self._triggers = in_dict["triggers"]
self._effects = in_dict["effects"]
self._goals = in_dict["goals"]
self._action_list = in_dict["action list"]
self._last_action = ''
self._current_action = '' # both ._last_action and ._current_action get updated by .update_action()
class Item(Actor):
def __init__(self,in_dict,**kwds)
super().__init__(in_dict,**kwds)
self._can_contain = in_dict("can contain") #boolean entry
self._inventory = in_dict("can contain") #either a list or dict entry
class Player(Actor):
def __init__(self, in_dict,**kwds):
super().__init__(in_dict,**kwds)
self._inventory = in_dict["inventory"] #entry should be a Container object
self._stats = in_dict["stats"]
Example dict that would be passed:
playerdict = {'name' : '', 'size' : '0', 'location' : '', 'triggers' : None, 'effects' : None, 'goals' : None, 'action list' = None, 'inventory' : Container(), 'stats' : None,}
(The None's get replaced by {} once the dictionary has been passed.)
So, in_dict gets passed to the previous class instead of a huge payload of **kwds.
I like this because:
It makes my code a lot neater and more manageable.
As long as the dicts have at least some entry for the key called, it doesn't break the code. Also, it doesn't matter if a given argument never gets used.
It seems like file IO just got a lot easier (dictionaries of player data stored as dicts, dictionaries of item data stored as dicts, etc.)
I get the point of **kwds (EDIT: apparently I didn't), and it hasn't seemed cumbersome when passing fewer arguments. This just appears to be a comfortable way of dealing with a need for a large number of attributes at the the creation of each instance.
That said, I'm still a major python noob. So, my question is this: Is there an underlying reason why passing the same dict repeatedly through super() to the base class would be a worse idea than just toughing it out with nasty (big and cluttered) **kwds passes? (e.g. issues with the interpreter that someone at my level would be ignorant of.)
EDIT:
Previously, creating a new Player might have looked like this, with an argument passed for each attribute.
bob = Player('bob', Location = 'here', ... etc.)
The number of arguments needed blew up, and I only included the attributes that really needed to be present to not break method calls from the Engine object.
This is the impression I'm getting from the answers and comments thus far:
There's nothing "wrong" with sending the same dictionary along, as long as nothing has the opportunity to modify its contents (Kirk Strauser) and the dictionary always has what it's supposed to have (goncalopp). The real answer is that the question was amiss, and using in_dict instead of **kwds is redundant.
Would this be correct? (Also, thanks for the great and varied feedback!)
I'm not sure I understand your question exactly, because I don't see how the code looked before you made the change to use in_dict. It sounds like you have been listing out dozens of keywords in the call to super (which is understandably not what you want), but this is not necessary. If your child class has a dict with all of this information, it can be turned into kwargs when you make the call with **in_dict. So:
class Actor:
def __init__(self, **kwds):
class Item(Actor):
def __init__(self, **kwds)
self._everything = kwds
super().__init__(**kwds)
I don't see a reason to add another dict for this, since you can just manipulate and pass the dict created for kwds anyway
Edit:
As for the question of the efficiency of using the ** expansion of the dict versus listing the arguments explicitly, I did a very unscientific timing test with this code:
import time
def some_func(**kwargs):
for k,v in kwargs.items():
pass
def main():
name = 'felix'
location = 'here'
user_type = 'player'
kwds = {'name': name,
'location': location,
'user_type': user_type}
start = time.time()
for i in range(10000000):
some_func(**kwds)
end = time.time()
print 'Time using expansion:\t{0}s'.format(start - end)
start = time.time()
for i in range(10000000):
some_func(name=name, location=location, user_type=user_type)
end = time.time()
print 'Time without expansion:\t{0}s'.format(start - end)
if __name__ == '__main__':
main()
Running this 10,000,000 times gives a slight (and probably statistically meaningless) advantage passing around a dict and using **.
Time using expansion: -7.9877269268s
Time without expansion: -8.06108212471s
If we print the IDs of the dict objects (kwds outside and kwargs inside the function), you will see that python creates a new dict for the function to use in either case, but in fact the function only gets one dict forever. After the initial definition of the function (where the kwargs dict is created) all subsequent calls are just updating the values of that dict belonging to the function, no matter how you call it. (See also this enlightening SO question about how mutable default parameters are handled in python, which is somewhat related)
So from a performance perspective, you can pick whichever makes sense to you. It should not meaningfully impact how python operates behind the scenes.
I've done that myself where in_dict was a dict with lots of keys, or a settings object, or some other "blob" of something with lots of interesting attributes. That's perfectly OK if it makes your code cleaner, particularly if you name it clearly like settings_object or config_dict or similar.
That shouldn't be the usual case, though. Normally it's better to explicitly pass a small set of individual variables. It makes the code much cleaner and easier to reason about. It's possible that a client could pass in_dict = None by accident and you wouldn't know until some method tried to access it. Suppose Actor.__init__ didn't peel apart in_dict but just stored it like self.settings = in_dict. Sometime later, Actor.method comes along and tries to access it, then boom! Dead process. If you're calling Actor.__init__(var1, var2, ...), then the caller will raise an exception much earlier and provide you with more context about what actually went wrong.
So yes, by all means: feel free to do that when it's appropriate. Just be aware that it's not appropriate very often, and the desire to do it might be a smell telling you to restructure your code.
This is not python specific, but the greatest problem I can see with passing arguments like this is that it breaks encapsulation. Any class may modify the arguments, and it's much more difficult to tell which arguments are expected in each class - making your code difficult to understand, and harder to debug.
Consider explicitly consuming the arguments in each class, and calling the super's __init__ on the remaining. You don't need to make them explicit:
class ClassA( object ):
def __init__(self, arg1, arg2=""):
pass
class ClassB( ClassA ):
def __init__(self, arg3, arg4="", *args, **kwargs):
ClassA.__init__(self, *args, **kwargs)
ClassB(3,4,1,2)
You can also leave the variables uninitialized and use methods to set them. You can then use different methods in the different classes, and all subclasses will have access to the superclass methods.
Lets say I have a program that has a large number of configuration options. The user can specify them in a config file. My program can parse this config file, but how should it internally store and pass around the options?
In my case, the software is used to perform a scientific simulation. There are about 200 options most of which have sane defaults. Typically the user only has to specify a dozen or so. The difficulty I face is how to design my internal code. Many of the objects that need to be constructed depend on many configuration options. For example an object might need several paths (for where data will be stored), some options that need to be passed to algorithms that the object will call, and some options that are used directly by the object itself.
This leads to objects needing a very large number of arguments to be constructed. Additionally, as my codebase is under very active development, it is a big pain to go through the call stack and pass along a new configuration option all the way down to where it is needed.
One way to prevent that pain is to have a global configuration object that can be freely used anywhere in the code. I don't particularly like this approach as it leads to functions and classes that don't take any (or only one) argument and it isn't obvious to the reader what data the function/class deals with. It also prevents code reuse as all of the code depends on a giant config object.
Can anyone give me some advice about how a program like this should be structured?
Here is an example of what I mean for the configuration option passing style:
class A:
def __init__(self, opt_a, opt_b, ..., opt_z):
self.opt_a = opt_a
self.opt_b = opt_b
...
self.opt_z = opt_z
def foo(self, arg):
algo(arg, opt_a, opt_e)
Here is an example of the global config style:
class A:
def __init__(self, config):
self.config = config
def foo(self, arg):
algo(arg, config)
The examples are in Python but my question stands for any similar programming langauge.
matplotlib is a large package with many configuration options. It use a rcParams module to manage all the default parameters. rcParams save all the default parameters in a dict.
Every functions will get the options from keyword argurments:
for example:
def f(x,y,opt_a=None, opt_b=None):
if opt_a is None: opt_a = rcParams['group1.opt_a']
A few design patterns will help
Prototype
Factory and Abstract Factory
Use these two patterns with configuration objects. Each method will then take a configuration object and use what it needs. Also consider applying a logical grouping to config parameters and think about ways to reduce the number of inputs.
psuedo code
// Consider we can run three different kinds of Simulations. sim1, sim2, sim3
ConfigFactory configFactory = new ConfigFactory("/path/to/option/file");
....
Simulation1 sim1;
Simulation2 sim2;
Simulation3 sim3;
sim1.run( configFactory.ConfigForSim1() );
sim2.run( configFactory.ConfigForSim2() );
sim3.run( configFactory.ConfigForSim3() );
Inside of each factory method it might create a configuration from a prototype object (that has all of the "sane" defaults) and the option file becomes just the things that are different from default. This would be paired with clear documentation on what these defaults are and when a person (or other program) might want to change them.
** Edit: **
Also consider that each config returned by the factory is a subset of the overall config.
Pass around either the config parsing class, or write a class that wraps it and intelligently pulls out the requested options.
Python's standard library configparser exposes the sections and options of an INI style configuration file using the mapping protocol, and so you can retrieve your options directly from that as though it were a dictionary.
myconf = configparser.ConfigParser()
myconf.read('myconf.ini')
what_to_do = myconf['section']['option']
If you explicitly want to provide the options using the attribute notation, create a class that overrides __getattr__:
class MyConf:
def __init__(self, path):
self._parser = configparser.ConfigParser()
self._parser.read('myconf.ini')
def __getattr__(self, option):
return self._parser[{'what_to_do': 'section'}[option]][option]
myconf = MyConf()
what_to_do = myconf.what_to_do
Have a module load the params to its namespace, then import it and use wherever you want.
Also see related question here