Passing around parsed arguments is no fun - Python

I use argparse in Python to parse arguments from the command line:
def main():
    parser = argparse.ArgumentParser(usage=usage)
    parser.add_argument('-v', '--verbose', dest='verbose', action='store_true')
    parser.add_argument(...)
    args = parser.parse_args()
I use the args object in only a few places in the code.
There are three methods, and the call stack looks like this:
def first_level(args):
    second_level()

def second_level():
    third_level()

def third_level():
    ### here I want to add some logging if args.verbose is True
I want to add some logging to third_level().
I don't want to change the signature of the method second_level().
How can I make the args object available in third_level()?
I could store args as a global variable, but I was told not to use global variables in a developer training some years ago ....
What is the common way to handle this?

Converting my comment to an answer. I'd suggest not putting the condition in your third_level() at all. There are mechanisms to let the logging module take care of that -- and those mechanisms can be controlled from outside of those three functions.
Something like:
import logging

def first_level(args):
    second_level()

def second_level():
    third_level()

def third_level():
    logging.info("log line which will be printed if logging is at INFO level")

def main():
    args = ....
    # Set the logging level, conditionally
    if args.verbose:
        logging.basicConfig(filename='myapp.log', level=logging.INFO)
    else:
        logging.basicConfig(filename='myapp.log', level=logging.WARNING)
    first_level(args)
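A variation on the same idea, sketched with hypothetical names (LEVELS, configure): count repeated -v flags and map the count to a logging level, so nothing below main() ever needs to see args:

```python
import argparse
import logging

# Hypothetical sketch: map repeated -v flags to logging levels.
LEVELS = {0: logging.WARNING, 1: logging.INFO, 2: logging.DEBUG}

def third_level():
    # No access to args here; the root logger's level decides what appears.
    logging.info("deep in the call stack")

def configure(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('-v', '--verbose', action='count', default=0)
    args = parser.parse_args(argv)
    level = LEVELS.get(args.verbose, logging.DEBUG)  # three or more -v caps at DEBUG
    logging.basicConfig(level=level)
    return level
```

With no flags only warnings and above are emitted; configure(['-vv']) selects DEBUG.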

A common module structure is something like:
imports ...
<constants>
options = {'verbose': 0, ...}
# alt: options = argparse.Namespace(logging=False, ...)

def levelone(args, **kwargs):
    ....

def leveltwo(...):
    ...

def levelthree(...):
    <use constant>
    <use options>

def parser():
    p = argparse.ArgumentParser()
    ....
    args = p.parse_args()  # this uses sys.argv
    return args

if __name__ == '__main__':
    args = parser()
    options.update(vars(args))
    levelone(args)
The body of the module has function definitions, and can be imported by another module. If used as a script, then parser() reads the command line. That global options dict is available for all sorts of state-like parameters. In a sense they are constants that the user, or an importing module, can tweak. Values imported from a config file can play the same role.
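That structure can be condensed into a runnable sketch (names like levelthree and options are illustrative, not from the question):

```python
import argparse

# Module-level options dict: deep functions read it, main() updates it once.
options = {'verbose': False}

def levelthree():
    # Reads module-level state instead of receiving `args` as a parameter.
    return 'chatty' if options['verbose'] else 'quiet'

def parse(argv=None):
    p = argparse.ArgumentParser()
    p.add_argument('-v', '--verbose', action='store_true')
    return p.parse_args(argv)

def main(argv=None):
    options.update(vars(parse(argv)))
    return levelthree()
```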
Another common pattern is to make your functions methods of a class, and pass args as object attributes.
class Foo():
    def __init__(self, logging=False):
        self.logging = logging

    def levelone(self):
        self.leveltwo()

    def leveltwo(self):
        <use self.logging>

Foo(args.logging).levelone()
While globals are discouraged, it's more because they get overused and spoil the modularity that functions provide. But Python also provides a module-level namespace that can contain more than just functions and classes. And any function defined in the module can access that namespace - unless its own definitions shadow it.
var1 = 'module level variable'
var2 = 'another'

def foo(var3):
    x = var1   # read/use var1
    var2 = 1   # shadows the module level definition
    etc
================
I'm not sure whether I should recommend this or not, but you could parse sys.argv within third_level.
def third_level():
    import argparse
    p = argparse.ArgumentParser()
    p.add_argument('-v', '--verbose', action='count')
    args, _ = p.parse_known_args()
    verbose = args.verbose
    <logging>
argparse imports sys and uses sys.argv. It can do that regardless of whether it is used at your script level, in your main or in some nested function. logging does the same sort of thing. You could use your own imported module to covertly pass values into functions. Obviously that can be abused. A class with class attributes can also be used this way.
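The "class with class attributes" idea at the end can be sketched like this (Config is a made-up name; the point is a single shared namespace written once in main()):

```python
import argparse

class Config:
    # Shared state: written once at the top level, readable anywhere.
    verbose = False

def third_level():
    return 'verbose output' if Config.verbose else 'terse output'

def main(argv):
    p = argparse.ArgumentParser()
    p.add_argument('-v', '--verbose', action='store_true')
    args = p.parse_args(argv)
    Config.verbose = args.verbose  # the single write
    return third_level()
```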

Related

what is the best way to mock variables in __init__ method python class

I have the following class, which uses a function to load config values from a file. load_config essentially converts the property in question in the YAML into a string.
import sys
from argparse import ArgumentParser

import load_config

class Example:
    def __init__(self):
        parser = ArgumentParser()
        self.args = parser.parse_known_args(sys.argv[1:])
        self.conf = self.args + {"val_b": "green"}
        self.val_a = load_config(self.conf, "val_a")
        self.val_b = load_config(self.conf, "val_b")
        if not self.val_a:
            raise ValueError()
        if not self.val_b:
            raise ValueError()
self.args loads arguments from the CLI. Imagine, for this example, that self.conf is a merge of CLI args and hard-coded args into a Python dict.
I'm trying to test the if conditions. How can I patch or pass fake values to self.val_a and self.val_b so that I can raise the exceptions and have a test case against them?
Since val_a comes from the CLI, I can patch sys.argv as patch("sys.argv", self.cli_arguments), but how can I patch, or is it possible to pass, a fake value to self.val_b during instantiation?
I don't want to patch the call to load_config, because I want to run separate tests that confirm the if statements get triggered.
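Since load_config itself isn't shown in the question, here is only a hedged sketch of the sys.argv half, with a simplified stand-in Example class that reads val_a straight from the CLI:

```python
import sys
import unittest
from unittest import mock
from argparse import ArgumentParser

# Simplified stand-in for the question's Example class (load_config omitted).
class Example:
    def __init__(self):
        parser = ArgumentParser()
        parser.add_argument('--val-a', default='')
        args, _ = parser.parse_known_args(sys.argv[1:])
        self.val_a = args.val_a
        if not self.val_a:
            raise ValueError('val_a missing')

class ExampleTest(unittest.TestCase):
    def test_missing_val_a_raises(self):
        # Patch sys.argv so the parser sees an empty command line.
        with mock.patch('sys.argv', ['prog']):
            self.assertRaises(ValueError, Example)

    def test_val_a_from_cli(self):
        with mock.patch('sys.argv', ['prog', '--val-a', 'blue']):
            self.assertEqual(Example().val_a, 'blue')
```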

adding command line arguments to multiple scripts in python

I have a use case where I have a main Python script with many command-line arguments. I need to break its functionality into multiple smaller scripts; a few command-line arguments will be common to more than one of the smaller scripts. I want to reduce code duplication. I tried to use decorators to register each argument to one or more scripts, but am not able to get around an error. Another caveat is that I want to set default values for each shared argument according to which script is being run. This is what I have currently:
argument_parser.py
import argparse
import functools
import itertools
from collections import defaultdict

from scripts import Scripts

_args_register = defaultdict(list)

def argument(scope):
    """
    Decorator to add an argument to the argument registry
    :param scope: The module name to register the current argument function to; can also be a list of modules
    :return: The decorated function after adding it to the registry
    """
    def register(func):
        if isinstance(scope, Scripts):
            _args_register[scope].append(func)
        elif isinstance(scope, list) and Scripts.ALL in scope:
            _args_register[Scripts.ALL].append(func)
        else:
            for module in scope:
                _args_register[module].append(func)
        return func
    return register
class ArgumentHandler:
    def __init__(self, script, parser=None):
        self._parser = parser or argparse.ArgumentParser(description=__doc__)
        assert script in Scripts
        self._script = script

    @argument(scope=Scripts.ALL)
    def common_arg(self):
        self._parser.add_argument("--common-arg",
                                  default=self._script,
                                  help="An arg common to all scripts")

    @argument(scope=[Scripts.TRAIN, Scripts.TEST])
    def train_test_arg(self):
        self._parser.add_argument("--train-test-arg",
                                  default=self._script,
                                  help="An arg common to train-test scripts added in argument handler")

    def parse_args(self):
        for argument in itertools.chain(_args_register[Scripts.ALL],
                                        _args_register[self._script]):
            argument()
        _args = self._parser.parse_args()
        return _args
One of the smaller scripts, train.py:
"""
A Train script to abstract away training tasks
"""
import argparse

from argument_parser import ArgumentHandler
from scripts import Scripts

current = Scripts.TRAIN
parser = argparse.ArgumentParser(description=__doc__)

def get_args() -> argparse.Namespace:
    parser.add_argument('--train-arg',
                        default='blah',
                        help='a train argument set in the train script')
    args_handler = ArgumentHandler(parser=parser, script=current)
    return args_handler.parse_args()

if __name__ == '__main__':
    print(get_args())
When I run train.py I get the following error:
  File "../argument_parser.py", line 68, in parse_args
    argument()
TypeError: common_arg() missing 1 required positional argument: 'self'

Process finished with exit code 1
I think this is because the decorators run at import time, but I am not sure. Is there any workaround for this, or any better way to reduce code duplication? Any help will be highly appreciated. Thanks!
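The diagnosis is right: the decorator runs while the class body executes, so the registry holds plain functions, not bound methods. One possible workaround (a hedged sketch, reduced to a single registered argument) is to pass the handler instance explicitly when the registered functions are invoked:

```python
import argparse

_registry = []

def register(func):
    # Runs at class-definition time: `func` is still a plain function here;
    # it only becomes a bound method when accessed through an instance.
    _registry.append(func)
    return func

class ArgumentHandler:
    def __init__(self):
        self._parser = argparse.ArgumentParser()

    @register
    def common_arg(self):
        self._parser.add_argument('--common-arg', default='x')

    def parse_args(self, argv=None):
        for func in _registry:
            func(self)  # supply `self` explicitly
        return self._parser.parse_args(argv)
```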

How can I collect unspecified command line options when using Python's `argh`?

With the Python CLI library argh I want to write a wrapper tool. This wrapper tool is supposed to read the two options -a and -b and to pass all other options on to a function (which then calls the wrapped UNIX tool with the left-over options via subprocess).
I have experimented with dispatch's parameter skip_unknown_args:
def wrapper(a=True, b=False):
    print("Enter wrapper")
    # 1. process a and b
    # 2. call_unix_tool(left-over-args)

if __name__ == '__main__':
    parser = argh.ArghParser()
    argh.set_default_command(parser, wrapper)
    argh.dispatch(parser, skip_unknown_args=True)
However the program still exits when it encounters unknown options, and it does not enter the function wrapper as needed. Additionally, I don't know where the unknown/skipped arguments are stored, so that I can pass them to the UNIX tool.
How can I tell argh to go into wrapper with the skipped arguments?
I believe this is a bug.
When skip_unknown_args=True, here namespace_obj is a tuple of a namespace object and the remaining args:
(Pdb) p namespace_obj
(ArghNamespace(_functions_stack=[<function wrapper at 0x105cb5e18>], a=False, b=True), ['-c'])
The underlying _get_function_from_namespace_obj expects a unary one:
154     function = _get_function_from_namespace_obj(namespace_obj)
...
191     if isinstance(namespace_obj, ArghNamespace):
I checked its corresponding issue and unit test; I have no idea what legitimate behaviour the author expects, and have dropped a comment there as well.
why not use argparse directly?
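That last point is worth spelling out: with plain argparse, parse_known_args already does what the wrapper needs. A hedged sketch (wrapper rewritten as an ordinary function, no argh):

```python
import argparse

def wrapper(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('-a', action='store_true')
    parser.add_argument('-b', action='store_true')
    # Anything the parser does not recognise lands in `rest`,
    # ready to hand to the wrapped UNIX tool via subprocess.
    args, rest = parser.parse_known_args(argv)
    return args, rest
```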
You cannot do this with skip_unknown_args=True, because as @georgexsh pointed out the argh library doesn't seem to behave sensibly with that option. However you can provide your own parser class which injects the unknown arguments into the normal namespace:
class ArghParserWithUnknownArgs(argh.ArghParser):
    def parse_args(self, args=None, namespace=None):
        namespace = namespace or ArghNamespace()
        (namespace_obj, unknown_args) = super(ArghParserWithUnknownArgs, self).parse_known_args(args=args, namespace=namespace)
        namespace_obj.__dict__['unknown_args'] = unknown_args
        return namespace_obj
Note that this class' parse_args method calls ArgumentParser's parse_known_args method!
With this class defined you can write the wrapper code following way:
def wrapper(a=True, b=False, unknown_args={}):
    print("a = %s, b = %s" % (a, b))
    print("unknown_args = %s" % unknown_args)

if __name__ == '__main__':
    parser = ArghParserWithUnknownArgs()
    argh.set_default_command(parser, wrapper)
    argh.dispatch(parser)
In your main function wrapper you can access all unknown arguments via the parameter unknown_args and pass them on to your subprocess command.
PS: In order to keep the help message tidy, decorate wrapper with
@argh.arg('--unknown_args', help=argparse.SUPPRESS)
Addendum: I created an enhanced version of the parser and compiled it into a ready-to-use module. Find it on Github.

Can two Python argparse objects be combined?

I have an object A which contains parserA - an argparse.ArgumentParser object
There is also object B which contains parserB - another argparse.ArgumentParser
Object A contains an instance of object B, however object B's arguments now need to be parsed by the parser in object A (since A is the one being called from the command line with the arguments, not B)
Is there a way to write in Python object A: parserA += B.parserB?
argparse was developed around objects. Other than a few constants and utility functions, it is all class definitions. The documentation focuses on use rather than on that class structure. But it may help to understand a bit of it.
parser = argparse.ArgumentParser(...)
creates a parser object.
arg1 = parser.add_argument(...)
creates an argparse.Action (a subclass, actually) object and adds it to several parser attributes (lists). Normally we ignore the fact that the method returns this Action object, but occasionally I find it helpful. And when I build a parser in an interactive shell I see this action.
args = parser.parse_args()
runs another method, and returns a namespace object (class argparse.Namespace).
The group methods and subparsers methods also create and return objects (groups, actions and/or parsers).
The ArgumentParser constructor takes a parents parameter, whose value is a list of parser objects.
With
parsera = argparse.ArgumentParser(parents=[parserb])
during the creation of parsera, the actions and groups in parserb are copied to parsera. That way, parsera will recognize all the arguments that parserb does. I encourage you to test it.
But there are a few qualifications. The copy is by reference. That is, parsera gets a pointer to each Action defined in parserb. Occasionally that creates problems (I won't get into that now). And one or the other has to have add_help=False. Normally a help action is added to a parser at creation. But if parserb also has a help there will be conflict (a duplication) that has to be resolved.
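Both qualifications can be seen in a few lines (note add_help=False on the parent):

```python
import argparse

# The parent must suppress its own -h/--help to avoid a conflict
# with the help action the child parser adds at creation.
parserb = argparse.ArgumentParser(add_help=False)
parserb.add_argument('--boo')

# parents=[...] copies parserb's actions into parsera (by reference).
parsera = argparse.ArgumentParser(parents=[parserb])
parsera.add_argument('--foo')

args = parsera.parse_args(['--foo', '1', '--boo', '2'])
```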
But parents can't be used if parsera has been created independently of parserb. There's no existing mechanism for adding the Actions from parserb. It might be possible to make a new parser with both as parents:
parserc = argparse.ArgumentParser(parents=[parsera, parserb])
I could probably write a function that would add arguments from parserb to parsera, borrowing ideas from the method that implements parents. But I'd have to know how conflicts are to be resolved.
Look at argparse._ActionsContainer._add_container_actions to see how arguments (Actions) are copied from a parent to a parser. Something that may be confusing is that each Action is part of a group (user-defined or one of the 2 default groups seen in the help) in addition to being in a parser.
Another possibility is to use
[argsA, extrasA] = parserA.parse_known_args()
[argsB, extrasB] = parserB.parse_known_args() # uses the same sys.argv
# or
args = parserB.parse_args(extrasA, namespace=argsA)
With this each parser handles the arguments it knows about, and returns the rest in the extras list.
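A runnable version of that hand-off, with two throwaway parsers (--alpha and --beta are illustrative):

```python
import argparse

parserA = argparse.ArgumentParser(add_help=False)
parserA.add_argument('--alpha')
parserB = argparse.ArgumentParser(add_help=False)
parserB.add_argument('--beta')

argv = ['--alpha', '1', '--beta', '2']
# parserA keeps what it knows and returns the rest as extras.
argsA, extrasA = parserA.parse_known_args(argv)
# parserB fills its own arguments into the namespace parserA built.
args = parserB.parse_args(extrasA, namespace=argsA)
```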
Unless the parsers are designed for this kind of integration, there will be rough edges. It may be easier to deal with those conflicts with Arnial's approach, which is to put the shared argument definitions in your own methods. Others like to put the argument parameters in some sort of database (list, dictionary, etc.) and build the parser from that. You can wrap parser creation in as many layers of boilerplate as you find convenient.
You can't use one ArgumentParser inside another. But there is a way around it: extract the code that adds arguments to a parser into a method.
Then you will be able to use those methods to merge arguments into one parser.
It will also be easier to group arguments (relative to their parsers). But you must be sure that the sets of argument names do not intersect.
Example:
foo.py:
import argparse

def add_foo_params(group):
    group.add_argument('--foo', help='foo help')

if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog='Foo')
    add_foo_params(parser)

boo.py:
import argparse

def add_boo_params(group):
    group.add_argument('--boo', help='boo help')

if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog='Boo')
    add_boo_params(parser)

fooboo.py:
import argparse

from foo import add_foo_params
from boo import add_boo_params

if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog='FooBoo')
    foo_group = parser.add_argument_group(title="foo params")
    boo_group = parser.add_argument_group(title="boo params")
    add_foo_params(foo_group)
    add_boo_params(boo_group)
For your use case, if you can, you could try simply sharing the same argparse object between classes via a dedicated method.
Below is an example based on what your situation seems to be.
import argparse

class B(object):
    def __init__(self, parserB=argparse.ArgumentParser()):
        super(B, self).__init__()
        self.parserB = parserB

    def addArguments(self):
        self.parserB.add_argument("-tb", "--test-b", help="Test B", type=str, metavar="")
        # Add more arguments specific to B

    def parseArgs(self):
        return self.parserB.parse_args()

class A(object):
    def __init__(self, parserA=argparse.ArgumentParser(), b=B()):
        super(A, self).__init__()
        self.parserA = parserA
        self.b = b

    def addArguments(self):
        self.parserA.add_argument("-ta", "--test-a", help="Test A", type=str, metavar="")
        # Add more arguments specific to A

    def parseArgs(self):
        return self.parserA.parse_args()

    def mergeArgs(self):
        self.b.parserB = self.parserA
        self.b.addArguments()
        self.addArguments()
Code Explanation:
As stated in the question, object A and object B contain their own parser objects. Object A also contains an instance of object B.
The code simply separates the anticipated flow into separate methods so that it is possible to keep adding arguments to a single parser before attempting to parse it.
Test Individual
a = A()
a.addArguments()
print(vars(a.parseArgs()))
# CLI Command
python test.py -ta "Testing A"
# CLI Result
{'test_a': 'Testing A'}
Combined Test
aCombined = A()
aCombined.mergeArgs()
print(vars(aCombined.parseArgs()))
# CLI Command
testing -ta "Testing A" -tb "Testing B"
# CLI Result
{'test_b': 'Testing B', 'test_a': 'Testing A'}
Additional
You can also make a general method that takes variable args and iterates over them, adding the args of the various classes. I created classes C and D for the sample below, with a general "parser" attribute name.
Multi Test
# Add method to Class A
def mergeMultiArgs(self, *objects):
    parser = self.parserA
    for object in objects:
        object.parser = parser
        object.addArguments()
    self.addArguments()
aCombined = A()
aCombined.mergeMultiArgs(C(), D())
print(vars(aCombined.parseArgs()))
# CLI Command
testing -ta "Testing A" -tc "Testing C" -td "Testing D"
# CLI Result
{'test_d': 'Testing D', 'test_c': 'Testing C', 'test_a': 'Testing A'}
Yes, they can be combined. Here is a function that merges two args namespaces:
from argparse import Namespace

def merge_args_safe(args1: Namespace, args2: Namespace) -> Namespace:
    """
    Merges two namespaces but throws an error if there are keys that collide.

    ref: https://stackoverflow.com/questions/56136549/how-can-i-merge-two-argparse-namespaces-in-python-2-x
    :param args1:
    :param args2:
    :return:
    """
    # - the merged args
    # The vars() function returns the __dict__ attribute of the given object, e.g. {field: value}.
    args = Namespace(**vars(args1), **vars(args2))
    return args
test:
def merge_args_test():
    args1 = Namespace(foo="foo", collided_key='from_args1')
    args2 = Namespace(bar="bar", collided_key='from_args2')
    args = merge_args_safe(args1, args2)
    print('-- merged args')
    print(f'{args=}')
output:
Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/brando/ultimate-utils/ultimate-utils-proj-src/uutils/__init__.py", line 1202, in <module>
    merge_args_test()
  File "/Users/brando/ultimate-utils/ultimate-utils-proj-src/uutils/__init__.py", line 1192, in merge_args_test
    args = merge_args(args1, args2)
  File "/Users/brando/ultimate-utils/ultimate-utils-proj-src/uutils/__init__.py", line 1116, in merge_args
    args = Namespace(**vars(args1), **vars(args2))
TypeError: argparse.Namespace() got multiple values for keyword argument 'collided_key'
python-BaseException
you can find it in this library: https://github.com/brando90/ultimate-utils
If you want to have collisions resolved do this:
def merge_two_dicts(starting_dict: dict, updater_dict: dict) -> dict:
    """
    Starts from the base starting dict and then adds the remaining key-values from the updater,
    replacing the values from the first starting/base dict with those of the second updater dict.

    For later: how does d = {**d1, **d2} resolve collisions?
    :param starting_dict:
    :param updater_dict:
    :return:
    """
    new_dict: dict = starting_dict.copy()  # start with keys and values of starting_dict
    new_dict.update(updater_dict)  # overwrite with keys and values of updater_dict
    return new_dict
def merge_args(args1: Namespace, args2: Namespace) -> Namespace:
    """
    ref: https://stackoverflow.com/questions/56136549/how-can-i-merge-two-argparse-namespaces-in-python-2-x
    :param args1:
    :param args2:
    :return:
    """
    # - the merged args
    # The vars() function returns the __dict__ attribute of the given object, e.g. {field: value}.
    merged_key_values_for_namespace: dict = merge_two_dicts(vars(args1), vars(args2))
    args = Namespace(**merged_key_values_for_namespace)
    return args
test:
def merge_args_test():
    args1 = Namespace(foo="foo", collided_key='from_args1')
    args2 = Namespace(bar="bar", collided_key='from_args2')
    args = merge_args(args1, args2)
    print('-- merged args')
    print(f'{args=}')
    assert args.collided_key == 'from_args2', 'Error in merge dict, expected the second argument to be the one used ' \
                                              'to resolve the collision'

Python Help: Accessing static member variable from another class

I'll do my best to describe the issue I am having. I am building a Python program that is built on multiple classes and uses the unittest framework. In a nutshell, the Main.py file has a ValidateDriver class that defines a driver variable as an ElementTree type. If I point this directly at the XML file I need to parse (i.e. driver = ElementTree.parse(r'C:\test.xml')), then I can access it from another class. However, in reality I don't have the actual XML file that is passed in from the command line until you get to the main function in the ValidateDriver class. So under the ValidateDriver class, driver would really be driver = ElementTree, and then in the main function I would reassign that variable with ValidateDriver.driver = ElementTree.parse(args.driver). However, this is the crux. When I go to the other class and try to call ValidateDriver.driver, I don't have the findall method/attribute available. Again, the only way it will work is to do something like ElementTree.parse(r'C:\test.xml'). If I did this in C# it would work, but I am new to Python and this is kicking my butt. Any help/suggestions are appreciated. I've included the code for both classes.
Main Function:
import sys
import argparse
import xml.etree.ElementTree as ElementTree
import unittest

import Tests.TestManufacturer

class ValidateDriver:
    driver = ElementTree

    def main(argv):
        parser = argparse.ArgumentParser(description='Validation.')
        parser.add_argument('-d', '--driver', help='Path and file name xml file', required=True)
        parser.add_argument('-v', '--verbosity',
                            help='Verbosity for test output. 1 for terse, 2 for verbose. Default is verbose',
                            default=2, type=int)
        # args = parser.parse_args()
        args = r'C:\test.c4i'
        # print("Validate Driver: %s" % args.driver)
        # print("Verbosity Level: %s" % args.verbosity)
        ValidateDriver.driver = ElementTree.parse(r'C:\test.c4i')
        loader = unittest.TestLoader()
        suite = loader.loadTestsFromModule(Tests.TestManufacturer)
        runner = unittest.TextTestRunner(verbosity=2)  # TODO Remove this...
        # TODO Uncomment this...
        runner = unittest.TextTestRunner(verbosity=args.verbosity)
        result = runner.run(suite)

    if __name__ == "__main__":
        main(sys.argv[1:])
Other Class, Test Manufacturer:
import unittest

import Main

manufacturer = ['']

class Tests(unittest.TestCase):
    # Test to see if Manufacturer exists.
    def test_manufacturer_exists(self):
        for m in Main.ValidateDriver.driver.findall('./manufacturer'):
            print m.text
Producing the following error:
C:\Python27\python.exe C:\Users\test\PycharmProjects\Validator\Main.py
Traceback (most recent call last):
  File "C:\Users\test\PycharmProjects\Validator\Main.py", line 22, in <module>
    class ValidateDriver:
  File "C:\Users\test\PycharmProjects\Validator\Main.py", line 65, in ValidateDriver
    main(sys.argv[1:])
  File "C:\Users\test\PycharmProjects\Validator\Main.py", line 36, in main
    ValidateDriver.driver = ElementTree.parse(r'C:\test.c4i')
NameError: global name 'ValidateDriver' is not defined

Process finished with exit code 1
The main problem seems to be that your main script is wrapped in a class. There's really no reason for this, and it is quite confusing.
if __name__ == "__main__":
    main_object = ValidateDriver()
    main_object.main(sys.argv[1:])
This should go outside the class definition
This has nothing to do with findall being available. The issue is that the class itself hasn't been completely declared at the time you try to access it. In Python, the file is read top to bottom. For example, this is not allowed:
if __name__ == "__main__":
    f()

def f():
    ...
The call to f must happen at the bottom of the file after it is declared.
What you're doing with ValidateDriver is similar, because the class isn't defined until the statements directly in its body are executed (this is different from functions, whose bodies of course aren't executed until they are called). You call main(sys.argv[1:]) inside the class body, which in turn tries to access ValidateDriver.driver, which doesn't exist yet.
Preferably, the main function, as well as the code which calls it, should be outside the class. As far as I can tell, the class doesn't need to exist at all (this isn't C# or Java -- you can put code directly at the module level without a class container). If you insist on putting it in a class as a static method, it must be defined as a class method:
@classmethod
def main(cls, argv):
    ...
which can then be called (outside the class definition) like:
ValidateDriver.main(sys.argv[1:])
But I stress that this is non-standard and should not be necessary.
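For reference, a minimal sketch of the restructured layout, with ElementTree.parse replaced by a plain placeholder so it stands alone: the class body only declares the attribute, and the parsing plus assignment happen at module level after the class definition is complete.

```python
class ValidateDriver:
    driver = None  # assigned once by main(), read by the test classes

def main(argv):
    # Stand-in for ElementTree.parse(args.driver); the point is *when*
    # the class attribute gets assigned, not what object it holds.
    ValidateDriver.driver = {'path': argv[0] if argv else None}

# Module level, *after* ValidateDriver is fully defined.
main(['test.xml'])
```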
