How to write super long dictionaries cleaner in function arguments?

I am using Argh in Python 3.6 to create a complex command-line function, but because my configuration file is deeply nested, getting a default value for an argument takes a long chain of dictionary keys.
This is not particularly readable, because a dictionary value is used as the key of another dictionary, and it could get even more nested than this.
There can be more arguments with default values like this, so keeping this up would soon get even more confusing. This is an example with just one default argument:
import argh
import config
@argh.arg('-v', '--version')
def generate(
        kind,
        version=config.template[config.data['default']['template']]['default']['version']):
    return ['RETURN.', kind, version]
The version argument default value is retrieved from my config module that generates a lot of data in list and dictionary formats.
To try and better explain the default value:
config.template[ # dictionary containing variables for a particular template
config.data['default']['template'] # the default template name set in the main configuration
]['default']['version'] # The default version variable within that particular template
What do you recommend to keep this more readable?

I'd just use the same trick used for mutable default values. This gives you more room to write something more readable.
@argh.arg('-v', '--version')
def generate(kind, version=None):
    if version is None:
        d = config.data['default']['template']
        version = config.template[d]['default']['version']
    return ['RETURN.', kind, version]
One drawback is that this is technically different: the data in config.data (or any of the dicts) could change between when the function is defined and when it is run, and this version picks up the changed data. You can do the dict lookups once before the function is defined to mitigate that.
# Choose whatever refactoring looks good to you
default_template = config.data['default']['template']
default_version = config.template[default_template]['default']['version']
@argh.arg('-v', '--version')
def generate(kind, version=default_version):
    return ['RETURN.', kind, version]
del default_template, default_version  # Optional
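The timing difference between the two variants can be seen in a small sketch. The `settings` dict and both functions are hypothetical stand-ins for the `config` module and `generate`; they are not part of Argh or of the original code.

```python
# Hypothetical stand-in for the data held by the config module.
settings = {'version': '1.0'}

def bound_at_definition(version=settings['version']):
    # The default was looked up once, when the def statement ran.
    return version

def looked_up_at_call(version=None):
    # The default is looked up again on every call that omits version.
    if version is None:
        version = settings['version']
    return version

settings['version'] = '2.0'  # the configuration changes later

print(bound_at_definition())  # 1.0
print(looked_up_at_call())    # 2.0
```

Which behaviour is preferable depends on whether the configuration is expected to change after import.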

Why do it on one line?
default_template_id = config.data['default']['template']
default_template = config.template[default_template_id]
default_version = default_template['default']['version']
def generate(kind, version=default_version):
    return ['RETURN.', kind, version]

Related

Access default keyword argument value within a decorator [duplicate]

For this function
def eat_dog(name, should_digest=True):
    print "ate dog named %s. Digested, too? %s" % (name, str(should_digest))
I want to, external to the function, read its arguments and any default values attached. So for this specific example, I want to know that name has no default value (i.e. that it is a required argument) and that True is the default value for should_digest.
I'm aware of inspect.getargspec(), which does give me information about arguments and default values, but I see no connection between the two:
ArgSpec(args=['name', 'should_digest'], varargs=None, keywords=None, defaults=(True,))
From this output how can I tell that True (in the defaults tuple) is the default value for should_digest?
Additionally, I'm aware of the "ask for forgiveness" model of approaching a problem, but unfortunately output from that error won't tell me the name of the missing argument:
>>> eat_dog()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: eat_dog() takes at least 1 argument (0 given)
To give context (why I want to do this), I'm exposing functions in a module over a JSON API. If the caller omits certain function arguments, I want to return a specific error that names the specific function argument that was omitted. If a client omits an argument, but there's a default provided in the function signature, I want to use that default.

Python3.x
In a Python 3.x world, you should probably use a Signature object:
import inspect
def get_default_args(func):
    signature = inspect.signature(func)
    return {
        k: v.default
        for k, v in signature.parameters.items()
        if v.default is not inspect.Parameter.empty
    }
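For the JSON API use case in the question, the same Signature object can also name the required arguments that a caller left out. The following is only a sketch: `split_parameters` and `call_with_payload` are hypothetical helper names, `eat_dog` is rewritten to return its message so the result can be checked, and the approach assumes ordinary parameters that can be passed by keyword.

```python
import inspect

def eat_dog(name, should_digest=True):
    return "ate dog named %s. Digested, too? %s" % (name, should_digest)

def split_parameters(func):
    """Return (required_names, defaults_dict) for func's named parameters."""
    required, defaults = [], {}
    for name, param in inspect.signature(func).parameters.items():
        # *args and **kwargs can never be "missing", so skip them.
        if param.kind in (inspect.Parameter.VAR_POSITIONAL,
                          inspect.Parameter.VAR_KEYWORD):
            continue
        if param.default is inspect.Parameter.empty:
            required.append(name)
        else:
            defaults[name] = param.default
    return required, defaults

def call_with_payload(func, payload):
    """Call func with a dict of arguments, naming any required ones omitted."""
    required, defaults = split_parameters(func)
    missing = [name for name in required if name not in payload]
    if missing:
        raise ValueError("missing argument(s): " + ", ".join(missing))
    # Values supplied by the caller override the signature's defaults.
    return func(**{**defaults, **payload})

print(split_parameters(eat_dog))  # (['name'], {'should_digest': True})
```

Calling `call_with_payload(eat_dog, {})` raises a `ValueError` that mentions `name`, which is the information the plain `TypeError` did not provide.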
Python2.x (old answer)
The args/defaults can be combined as:
import inspect
a = inspect.getargspec(eat_dog)
zip(a.args[-len(a.defaults):],a.defaults)
Here a.args[-len(a.defaults):] are the arguments with default values, and a.defaults are the corresponding default values.
You could even pass the output of zip to the dict constructor and create a mapping suitable for keyword unpacking.
Looking at the docs, this solution will only work on Python 2.6 or newer, since I assume that inspect.getargspec returns a named tuple. Earlier versions returned a regular tuple, but it would be very easy to modify accordingly. Here's a version which works with older (and newer) versions:
import inspect
def get_default_args(func):
    """
    returns a dictionary of arg_name:default_values for the input function
    """
    args, varargs, keywords, defaults = inspect.getargspec(func)
    if defaults is None:  # getargspec reports "no defaults" as None
        return {}
    return dict(zip(args[-len(defaults):], defaults))
Come to think of it:
return dict(zip(reversed(args), reversed(defaults)))
would also work and may be more intuitive to some people.

Depending on exactly what you need, you might not need the inspect module since you can check the __defaults__ attribute of the function:
>>> eat_dog.__defaults__
(True,)
>>> eat_dog.__code__.co_argcount
2
>>> eat_dog.__code__.co_varnames
('name', 'should_digest')
>>>
>>> eat_dog.__kwdefaults__
>>> eat_dog.__code__.co_kwonlyargcount
0

You can use the inspect module with its getargspec function:
inspect.getargspec(func)
Get the names and default values of a Python function’s arguments. A tuple of four things is returned: (args, varargs, keywords, defaults). args is a list of the argument names (it may contain nested lists). varargs and keywords are the names of the * and ** arguments or None. defaults is a tuple of default argument values or None if there are no default arguments; if this tuple has n elements, they correspond to the last n elements listed in args.
See mgilson's answer for exact code on how to retrieve argument names and their default values.

For those who want to grab the default of one specific parameter, building on mgilson's answer:
value = signature(my_func).parameters['param_name'].default
Here's a full working version, done in Python 3.8.2:
from inspect import signature
def my_func(a, b, c, param_name='apple'):
    pass
value = signature(my_func).parameters['param_name'].default
print(value == 'apple') # True

To take care of keyword-only args (and because defaults and kwonlydefaults can be None):
spec = inspect.getfullargspec(func)
defaults = dict(zip(spec.args[::-1], (spec.defaults or ())[::-1]))
defaults.update(spec.kwonlydefaults or {})
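As a quick check of the snippet above, here it is wrapped in a function and applied to a signature that mixes positional and keyword-only parameters. The names `all_defaults` and `sample` are made up for this sketch.

```python
import inspect

def all_defaults(func):
    spec = inspect.getfullargspec(func)
    # Pair positional names with their defaults, starting from the right.
    defaults = dict(zip(spec.args[::-1], (spec.defaults or ())[::-1]))
    # Keyword-only defaults are already a name-to-value mapping (or None).
    defaults.update(spec.kwonlydefaults or {})
    return defaults

def sample(a, b=2, *, c, d=4):
    pass

def no_defaults(x):
    pass

print(all_defaults(sample))       # {'b': 2, 'd': 4}
print(all_defaults(no_defaults))  # {}
```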

You can get this via some of the __dunder__ attributes, as mentioned in other posts. Putting that into a simple helper function gets you a dictionary of default values.
.__code__.co_varnames: A tuple of the argument names, followed by the names of any local variables
.__code__.co_argcount: The number of positional arguments, used to cut that tuple down to the arguments
.__defaults__: A tuple of the default values
It is worth noting that this tuple only includes the defaults that were provided, and parameters with defaults must always come last in the function arguments.
You can use these items to match the last n argument names in .__code__.co_varnames with all the items in .__defaults__.
EDIT Thanks to @griloHBG - Added if statement to prevent exceptions when no defaults are specified.
def my_fn(a, b=2, c='a'):
    pass
def get_defaults(fn):
    if fn.__defaults__ is None:
        return {}
    # Keep only the positional argument names; co_varnames also lists locals
    arg_names = fn.__code__.co_varnames[:fn.__code__.co_argcount]
    return dict(zip(
        arg_names[-len(fn.__defaults__):],
        fn.__defaults__
    ))
print(get_defaults(my_fn))
Should give:
{'b': 2, 'c': 'a'}

In Python, all the arguments with default values come after the arguments without default values. So the mapping should start from the end and continue until you exhaust the list of default values. Hence the logic:
dict(zip(reversed(args), reversed(defaults)))
gives the correctly mapped defaults.

How to avoid repetition in a ternary operator assignment?

When fetching a number of config values from os.environ, it's nice to have defaults in the Python code, so that the application can easily start in a number of contexts.
A typical Django settings.py has a number of
SOME_SETTING = os.environ.get('SOME_SETTING')
lines.
To provide sensible defaults we opted for
SOME_SETTING = os.environ.get('SOME_SETTING') or "theValue"
However, this is error-prone, because calling the application with
SOME_SETTING="" manage.py
will lead to SOME_SETTING being set to theValue instead of the explicitly defined "".
Is there a way to assign values in Python using the ternary a = b if b else d without repeating b or assigning it to a shorthand variable first?
The problem becomes obvious if we look at
SOME_VERY_LONG_VAR_NAME = os.environ.get('SOME_VERY_LONG_VAR_NAME') if os.environ.get('SOME_VERY_LONG_VAR_NAME') else 'meh'
It would be much nicer to be able to do something like
SOME_VERY_LONG_VAR_NAME = if os.environ.get('SOME_VERY_LONG_VAR_NAME') else 'meh'

Just like Python's built-in mapping class dict, os.environ.get has a second argument, and it seems like you want it:
SOME_SETTING = os.environ.get('SOME_SETTING', "theValue")
This is the same as
try:
    SOME_SETTING = os.environ['SOME_SETTING']
except KeyError:
    SOME_SETTING = "theValue"

If you read dict.get()'s doc, you'll find that the method's signature is get(self, key, default=None). The default argument is what gets returned if the key is not found in the dict (and it defaults to a sensible None). So you can use this second argument instead of doing an erroneous boolean test:
SOME_SETTING = os.environ.get('SOME_SETTING', "theValue")
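The difference between the two spellings only shows up when the variable is set but empty. A minimal sketch, where `env_or` and `env_default` are hypothetical helper names and `OTHER_SETTING` is a made-up variable used to show the unset case:

```python
import os

def env_or(name, default):
    # Falls back whenever the value is falsy, including an empty string.
    return os.environ.get(name) or default

def env_default(name, default):
    # Falls back only when the variable is not set at all.
    return os.environ.get(name, default)

os.environ['SOME_SETTING'] = ''        # explicitly set to an empty string
os.environ.pop('OTHER_SETTING', None)  # make sure this one is unset

print(repr(env_or('SOME_SETTING', 'theValue')))        # 'theValue'
print(repr(env_default('SOME_SETTING', 'theValue')))   # ''
print(repr(env_default('OTHER_SETTING', 'theValue')))  # 'theValue'
```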

How do I programmatically add new functions to current scope in Python?

In Python it is easy to create new functions programmatically. How would I assign this to programmatically determined names in the current scope?
This is what I'd like to do (in non-working code):
obj_types = ('cat', 'dog', 'donkey', 'camel')
for obj_type in obj_types:
    'create_'+obj_type = lambda id: id
In the above example, the assignment of lambda into a to-be-determined function name obviously does not work. In the real code, the function itself would be created by a function factory.
The background is laziness and do-not-repeat-yourself: I've got a dozen or more object types, and I'd assign a generated function to each of them. So the code currently looks like:
create_cat = make_creator('cat')
# ...
create_camel = make_creator('camel')
The functions create_cat etc are used hardcoded in a parser.
If I were creating classes as new types programmatically, types.new_class(), as seen in the docs, would seem to be the solution.
Is it my best bet to (mis)use this approach?

One way to accomplish what you are trying to do (without creating functions with dynamic names) is to store the lambdas in a dict, using the name as the key. Instead of calling create_cat() you would call create['cat'](). That would dovetail nicely with not hardcoding names in the parser logic as well.
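A minimal sketch of that dict-based approach follows. The body of `make_creator` is a hypothetical stand-in, since the question does not show the real factory.

```python
def make_creator(obj_type):
    # Hypothetical factory; the real one would build the parsed object.
    def creator(obj_id):
        return (obj_type, obj_id)
    return creator

obj_types = ('cat', 'dog', 'donkey', 'camel')

# One entry per object type, keyed by name instead of by variable name.
create = {obj_type: make_creator(obj_type) for obj_type in obj_types}

print(create['cat'](7))  # ('cat', 7)
```

Adding a new object type then only means extending `obj_types`.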

Vaughn Cato points out that one could just assign into locals()[object_type] = factory(object_type). However the Python docs prohibit this: "Note: The contents of this dictionary should not be modified; changes may not affect the values of local and free variables used by the interpreter"
D. Shawley points out that it would be wiser to use a dict() object whose entries hold the functions. Access would be simple by using create['cat']() in the parser. While this is compelling, I do not like the syntax overhead of the brackets and quotes required.
J.F. Sebastian points to classes. And this is what I ended up with:
# Omitting code of these classes for clarity
class Entity:
    def __init__(self, file_name, line_number):
        # Store location, good for debug, messages, and general indexing
        pass
# The following classes are the real objects to be generated by a parser
# Their constructors must consume whatever data is provided by the tokens
# as well as calling super() to forward the file_name,line_number info.
class Cat(Entity): pass
class Camel(Entity): pass
class Parser:
    def parse_file(self, fn):
        # ...
        # Function factory to wrap object constructor calls
        def create_factory(obj_type):
            def creator(text, line_number, token):
                try:
                    return obj_type(*token,
                                    file_name=fn, line_number=line_number)
                except Exception as e:
                    # For debug of constructor during development
                    print(e)
            return creator
        # Helper class, serving as a 'dictionary' of obj construction functions
        class create: pass
        for obj_type in (Cat, Camel):
            setattr(create,
                    obj_type.__name__.lower(),
                    create_factory(obj_type))
        # Parsing code now can use (again simplified for clarity):
        expression = Keyword('cat').setParseAction(create.cat)
This is helper code for deploying a pyparsing parser. D. Shawley is correct in that the dict would actually make it easier to generate the parser grammar dynamically.

Passing optional arguments from optparse

I'm trying to figure out how to pass optional arguments from optparse. The problem I'm having is if an optparse option is not specified, it defaults to a None type, but if I pass the None type into a function, it yells at me instead of using the default (Which is understandable and valid).
conn = psycopg2.connect(database=options.db, hostname=options.hostname, port=options.port)
The question is: how do I use the function's defaults for optional arguments, but still pass in user inputs when there are any, without having a huge number of if statements?

Define a function remove_none_values that filters a dictionary for none-valued arguments.
def remove_none_values(d):
    return dict((k, v) for (k, v) in d.items() if v is not None)
kwargs = {
    'database': options.db,
    'hostname': options.hostname,
    ...
}
conn = psycopg2.connect(**remove_none_values(kwargs))
Or, define a function wrapper that removes none values before passing the data on to the original function.
import functools
def ignore_none_valued_kwargs(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        newkwargs = dict((k, v) for (k, v) in kwargs.items() if v is not None)
        return f(*args, **newkwargs)
    return wrapper
my_connect = ignore_none_valued_kwargs(psycopg2.connect)
conn = my_connect(database=options.db, hostname=options.hostname, port=options.port)
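To see the filtering idea end to end without a database, here is a sketch that parses a fixed argument list with optparse and feeds the result to a stand-in function. The option names and the `connect` function are hypothetical; `connect` only imitates a function with its own defaults and is not psycopg2's API.

```python
from optparse import OptionParser

# Hypothetical stand-in for a connect function that has its own defaults.
def connect(database='postgres', host='localhost', port=5432):
    return (database, host, port)

parser = OptionParser()
parser.add_option('--db', dest='database')
parser.add_option('--host', dest='host')
parser.add_option('--port', dest='port', type='int')

# Only --db is given, so optparse leaves host and port as None.
options, _ = parser.parse_args(['--db', 'shop'])

# vars(options) exposes the parsed values as a dict; drop the None entries
# so that connect() falls back to its own defaults for them.
supplied = {k: v for k, v in vars(options).items() if v is not None}

print(connect(**supplied))  # ('shop', 'localhost', 5432)
```

This relies on each option's `dest` matching the keyword name the target function expects.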

The opo module of my thebops package (pip install thebops, https://bitbucket.org/therp/thebops) contains an add_optval_option function.
This uses an additional keyword argument empty which specifies the value to use if the option is used without a value. If one of the option strings is found in the commandline, this value is injected into the argument list.
This is still hackish, but at least it is made a simple-to-use function ...
It works well under the following circumstances:
The argument vector does already exist when the option is created. This is usually true.
All programs I found which sport arguments with optional values require the given value to be attached as --option=value or -ovalue rather than --option value or -o value.
Maybe I'll tweak thebops.optparse to support the empty argument as well; but I'd like to have a test suite first to prevent regressions, preferably the original Optik / optparse tests.
This is the code:
from sys import argv
def add_optval_option(pog, *args, **kwargs):
    """
    Add an option which can be specified without a value;
    in this case, the value (if given) must be contained
    in the same argument as seen by the shell,
    i.e.:
    --option=VALUE, --option will work;
    --option VALUE will *not* work
    Arguments:
    pog -- parser or group
    empty -- the value to use when used without a value
    Note:
    If you specify a short option string as well, the syntax given by the
    help will be wrong; -oVALUE will be supported, -o VALUE will not!
    Thus it might be wise to create a separate option for the short
    option strings (in a "hidden" group which isn't added to the parser after
    being populated) and just mention it in the help string.
    """
    if 'empty' in kwargs:
        empty_val = kwargs.pop('empty')
        # in this case it's a good idea to have a <default> value; this can be
        # given by another option with the same <dest>, though
        for i in range(1, len(argv)):
            a = argv[i]
            if a == '--':
                break
            if a in args:
                argv.insert(i+1, empty_val)
                break
    pog.add_option(*args, **kwargs)

creating variables from external data in python script

I want to read an external data source (excel) and create variables containing the data. Suppose the data is in columns and each column has a header with the variable name.
My first idea is to write a function so I can easily reuse it. I could also give it some additional keyword arguments to make the function more versatile.
The problem I'm facing is that I want to refer to the data in python (interactively) via the variable names. I don't know how to do that (with a function). The only solution I see is returning the variable names and the data from my function (eg as lists), and do something like this:
def get_data(my_excel):
    (...)
    return names, values
names, values = get_data(my_excel)
for n,v in zip(names, values):
    exec(''.join([n, '= v']))
Can I get the same result directly?
Thanks,
Roel

Use a dictionary to store your mapping from name to value instead of creating local variables.
def get_data(excel_document):
    mapping = {}
    mapping['name1'] = 'value1'
    # ...
    return mapping
mapping = get_data(my_excel)
for name, value in mapping.items():
    pass  # use them
If you really want to populate variables from the mapping, you can modify globals() (or locals()), but it is generally considered bad practice.
mapping = get_data(my_excel)
globals().update(mapping)
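One reason to be wary of the locals() variant in particular: in CPython, writing into the dictionary returned by locals() inside a function does not create a real local variable. A small experiment shows this; the function and variable names are made up for the sketch.

```python
def assign_via_locals():
    # Writing into the dict returned by locals() does not define a variable.
    locals()['created'] = 1
    try:
        # 'created' is never assigned in this function, so it is looked up
        # as a global name, and that lookup fails.
        return created
    except NameError:
        return 'NameError'

print(assign_via_locals())  # NameError
```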

If you just want to set a variable for each name in names, use the loop below. It only works at module level, where locals() returns the same dictionary as globals(); inside a function, changes to that dictionary do not create local variables.
for n, v in zip(names, values):
    locals()[n] = v
If you'd rather have a single object to access the data, which is much cleaner, simply use a dict, and return that from your function.
def get_data():
    (...)
    return dict(zip(names, values))
To access the value of the name "a", simply use get_data()["a"].
Finally, if you want to access the data as attributes of an object, you can update the __dict__ of an object (unexpected behaviour may occur if any of your column names are equal to any special python methods).
class Data(object):
    def __init__(self, my_excel):
        (...)
        self.__dict__.update(zip(names, values))
data = Data("test.xls")
print(data.a)

The traditional approach would be to stuff the key/value pairs into a dict so that you can easily pass the whole structure around to other functions. If you really want to store them as attributes instead of dict keys, consider creating a class to hold them:
class Values(object): pass
store = Values()
for key, value in zip(names, values):
    setattr(store, key, value)
That keeps the variables in their own namespace, separate from your running code. That's almost always a Good Thing. What if you get a spreadsheet with a header called "my_excel"? Suddenly you've lost access to your original my_excel object, which would be very inconvenient if you needed it again.
But in any case, you should never use exec unless you know exactly what you're doing. And even then, don't use exec. For instance, I know how your code works and send you a spreadsheet with "os.system('echo rm -rf *')" in a cell. You probably don't really want to execute that.
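The namespace argument can be made concrete with a short sketch. The column headers and data below are invented for illustration; the point is only that a header named my_excel ends up on the store object and leaves the existing variable alone.

```python
names = ['my_excel', 'price']    # hypothetical column headers
values = [[1, 2], [9.5, 7.25]]   # hypothetical column data

my_excel = 'test.xls'            # an existing variable we want to keep

class Values(object):
    pass

store = Values()
for key, value in zip(names, values):
    # Each column becomes an attribute of store, not a module-level name.
    setattr(store, key, value)

print(my_excel)        # test.xls
print(store.my_excel)  # [1, 2]
print(store.price)     # [9.5, 7.25]
```

Because the headers are only ever used as attribute names, cell contents are never executed as code.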
