I have a main Python script with many command-line arguments, and I need to break its functionality into multiple smaller scripts. A few command-line arguments will be common to more than one of the smaller scripts, and I want to reduce code duplication. I tried to use decorators to register each argument to one or more scripts, but I can't get around an error. Another caveat is that I want to set the default value of a shared argument according to which script is being run. This is what I have currently:
argument_parser.py
import argparse
import functools
import itertools
from scripts import Scripts
from collections import defaultdict

_args_register = defaultdict(list)

def argument(scope):
    """
    Decorator to add argument to argument registry
    :param scope: The module name to register current argument function to can also be a list of modules
    :return: The decorated function after adding it to registry
    """
    def register(func):
        if isinstance(scope, Scripts):
            _args_register[scope].append(func)
        elif isinstance(scope, list) and Scripts.ALL in scope:
            _args_register[Scripts.ALL].append(func)
        else:
            for module in scope:
                _args_register[module].append(func)
        return func
    return register

class ArgumentHandler:
    def __init__(self, script, parser=None):
        self._parser = parser or argparse.ArgumentParser(description=__doc__)
        assert script in Scripts
        self._script = script

    @argument(scope=Scripts.ALL)
    def common_arg(self):
        self._parser.add_argument("--common-arg",
                                  default=self._script,
                                  help="An arg common to all scripts")

    @argument(scope=[Scripts.TRAIN, Scripts.TEST])
    def train_test_arg(self):
        self._parser.add_argument("--train-test-arg",
                                  default=self._script,
                                  help=f"An arg common to train-test scripts added in argument handler"
                                  )

    def parse_args(self):
        for argument in itertools.chain(_args_register[Scripts.ALL],
                                        _args_register[self._script]):
            argument()
        _args = self._parser.parse_args()
        return _args
One of the smaller scripts, train.py:
"""
A Train script to abstract away training tasks
"""
import argparse
from argument_parser import ArgumentHandler
from scripts import Scripts

current = Scripts.TRAIN
parser = argparse.ArgumentParser(description=__doc__)

def get_args() -> argparse.Namespace:
    parser.add_argument('--train-arg',
                        default='blah',
                        help='a train argument set in the train script')
    args_handler = ArgumentHandler(parser=parser, script=current)
    return args_handler.parse_args()

if __name__ == '__main__':
    print(get_args())
When I run train.py I get the following error
File "../argument_parser.py", line 68, in parse_args
argument()
TypeError: common_arg() missing 1 required positional argument: 'self'
Process finished with exit code 1
I think this is because decorators are run at import time, but I am not sure. Is there any workaround for this, or a better way to reduce code duplication? Any help will be highly appreciated. Thanks!
Related
I would like to create a python file that can be run from the terminal - this file will be in charge of running various other python files depending on the functionality required along with their required arguments, respectively. For example, this is the main file:
import sys
from midi_to_audio import arguments, run

files = ["midi_to_audio.py"]

def main(file, args):
    if file == "midi_to_audio.py":
        if len(args) != arguments:
            print("Incorrect argument length")
        else:
            run("test","t")

if __name__ == '__main__':
    sys.argv.pop(0)
    file = sys.argv[0]
    sys.argv.pop(0)
    if file not in files:
        print("File does not exist")
    else:
        main(file, sys.argv)
And this is the first file used in the example (midi_to_audio.py):
arguments = 2

def run(file, output_file):
    print("Ran this method")
So depending on which file I've specified when running the cmd via the terminal, it will go into a different file and call its run method. If the arguments are not as required in each file, it will not run
For example: >python main.py midi_to_audio.py file_name_here output_name_here
My problem is that, as I add more files with their own "arguments" and "run" functions, I wonder whether Python is going to get confused about which arguments or which run function to use. Is there a safer, more generic way of doing this?
Also, is there a way of getting the names of the python files depending on which files I've imported? Because for now I have to import the file and manually add their file name to the files list in main.py
Your runner could look like this, to load a module by name and check it has run, and check the arguments given on the command line, and finally dispatch to the module's run function.
import sys
import importlib

def main():
    args = sys.argv[1:]
    if len(args) < 1:
        raise Exception("No module name given")
    module_name = args.pop(0).removesuffix(".py")  # grab the first argument and remove the .py suffix
    module = importlib.import_module(module_name)  # import a module by name
    if not hasattr(module, 'run'):  # check if the module has a run function
        raise Exception(f"Module {module_name} does not have a run function")
    arg_count = getattr(module, 'arguments', 0)  # get the number of arguments the module needs
    if len(args) != arg_count:
        raise Exception(f"Module {module_name} requires {arg_count} arguments, got {len(args)}")
    module.run(*args)

if __name__ == '__main__':
    main()
This works with the midi_to_audio.py module in your post.
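On the second part of the question (listing the available files without maintaining a list by hand), one option is to scan a directory for modules that define run. This is a rough sketch rather than a definitive approach; discover_runnable_modules is a name made up for illustration, and it assumes all runnable scripts live in one directory:

```python
import importlib
import pkgutil
import sys


def discover_runnable_modules(directory):
    """Return {module_name: module} for every module in `directory` that defines run()."""
    if directory not in sys.path:
        sys.path.insert(0, directory)  # make the modules importable by bare name
    found = {}
    for info in pkgutil.iter_modules([directory]):
        # Importing executes the module's top-level code, so keep that code cheap.
        module = importlib.import_module(info.name)
        if callable(getattr(module, "run", None)):
            found[info.name] = module
    return found
```

The returned dict could replace the hand-written files list: a name given on the command line is valid if it is a key, and the value is the module to dispatch to.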
I have the following class, which uses a function to load config values from a file. load_config essentially converts the property in question in the yaml into a string.
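import sys
from argparse import ArgumentParser

from load_config import load_config  # assumes load_config.py exposes a function of the same name

class Example:
    def __init__(self):
        parser = ArgumentParser()
        # parse_known_args returns (namespace, leftover_args)
        self.args, _ = parser.parse_known_args(sys.argv[1:])
        # merge CLI args with hard-coded values into one dict
        self.conf = {**vars(self.args), "val_b": "green"}
        self.val_a = load_config(self.conf, "val_a")
        self.val_b = load_config(self.conf, "val_b")
        if not self.val_a:
            raise ValueError()
        if not self.val_b:
            raise ValueError()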
self.args loads arguments from the CLI. Imagine, for this example, self.conf is a merge of CLI args and hard coded args into a python dict.
I'm trying to test against the if conditions. How can I patch or pass fake values to self.val_a and self.val_b so that I can raise the exceptions and have a test case against it?
Since val_a comes from CLI, I can patch sys.argv as patch("sys.argv", self.cli_arguments), but how can I patch or is it possible to pass a fake value to self.val_b during instantiation?
I don't want to patch the call to load_config because I want to run separate tests that confirm the if statements get triggered.
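One way to reach both if branches without patching load_config is to make the hard-coded values injectable through the constructor. The sketch below is only an illustration under several assumptions: the load_config function is a hypothetical stand-in for the real one, the --val-a option name is invented, and the argv and defaults parameters do not exist in the original class:

```python
import argparse


def load_config(conf, key):
    # Hypothetical stand-in for the real load_config: returns the value as a string.
    value = conf.get(key)
    return "" if value is None else str(value)


class Example:
    def __init__(self, argv=None, defaults=None):
        parser = argparse.ArgumentParser()
        parser.add_argument("--val-a")  # assumed option name
        args, _unknown = parser.parse_known_args(argv)
        # Production code keeps the hard-coded value; tests can override it.
        hard_coded = {"val_b": "green"} if defaults is None else defaults
        self.conf = {**vars(args), **hard_coded}
        self.val_a = load_config(self.conf, "val_a")
        self.val_b = load_config(self.conf, "val_b")
        if not self.val_a:
            raise ValueError("val_a is empty")
        if not self.val_b:
            raise ValueError("val_b is empty")
```

A test can then call Example(argv=["--val-a", "red"], defaults={"val_b": ""}) and expect ValueError, while Example(argv=[]) exercises the val_a branch; neither needs sys.argv or load_config to be patched.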
I'm trying to access the "resources" folder with ArgumentParser.
This code and the "resources" folder are in the same folder...
Just to try to run the code, I've put a print function in the predict function. However this error occurs:
predict.py: error: the following arguments are required: resources_path
How can I fix it?
from argparse import ArgumentParser

def parse_args():
    parser = ArgumentParser()
    parser.add_argument("resources_path", help='/resources')
    return parser.parse_args()

def predict(resources_path):
    print(resources_path)
    pass

if __name__ == '__main__':
    args = parse_args()
    predict(args.resources_path)
I am guessing from your error message that you are trying to call your program like this:
python predict.py
The argument parser by default gets the arguments from sys.argv, i.e. the command line. You'll have to pass it yourself like this:
python predict.py resources
It's possible that you want the resources argument to default to ./resources if you don't pass anything. (And I further assume you want ./resources, not /resources.) There's a keyword argument for that, but for a positional argument the default is only used when you also pass nargs='?', which makes the argument optional:
....
parser.add_argument('resources_path', nargs='?', default='./resources')
...
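As a complete, runnable version of that idea: a positional argument only falls back to its default when it is declared with nargs='?'. The argv parameter below is an addition for illustration, so the function can be tried without touching the real command line:

```python
from argparse import ArgumentParser


def parse_args(argv=None):
    parser = ArgumentParser()
    parser.add_argument(
        "resources_path",
        nargs="?",               # zero or one value, so the positional may be omitted
        default="./resources",   # used when no value is given
        help="path to the resources folder (default: ./resources)",
    )
    return parser.parse_args(argv)  # argv=None means "read sys.argv[1:]"


def predict(resources_path):
    print(resources_path)


if __name__ == "__main__":
    predict(parse_args().resources_path)
```

Running python predict.py then uses ./resources, and python predict.py data uses data.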
With the Python CLI library argh I want to write a wrapper tool. This wrapper tool is supposed to read the two options -a and -b and to pass all other options to a function (which then calls the wrapped UNIX tool with the left-over options via subprocess).
I have experimented with dispatch's parameter skip_unknown_args:
import argh

def wrapper(a=True, b=False):
    print("Enter wrapper")
    # 1. process a and b
    # 2. call_unix_tool(left-over-args)

if __name__ == '__main__':
    parser = argh.ArghParser()
    argh.set_default_command(parser, wrapper)
    argh.dispatch(parser, skip_unknown_args=True)
However the program still does exit when it encounters unknown options and it does not enter the function wrapper as needed. Additionally I don't know where the unknown/skipped arguments are stored, so that I can pass them to the UNIX tool.
How can I tell argh to go into wrapper with the skipped arguments?
I believe this is a bug.
When skip_unknown_args=True, namespace_obj is a tuple containing a namespace object and the remaining args:
(Pdb) p namespace_obj
(ArghNamespace(_functions_stack=[<function wrapper at 0x105cb5e18>], a=False, b=True), ['-c'])
The underlying _get_function_from_namespace_obj expects a single namespace object rather than a tuple:
154 function = _get_function_from_namespace_obj(namespace_obj)
...
191 if isinstance(namespace_obj, ArghNamespace):
I checked the corresponding issue and unit test, but I have no idea what behaviour the author considers legitimate; I have dropped a comment there as well.
Why not use argparse directly?
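A minimal sketch of what using argparse directly could look like, relying on parse_known_args to separate -a and -b from everything else. Treating -a and -b as simple on/off flags and using ls as the wrapped tool are assumptions made for illustration:

```python
import argparse
import subprocess


def split_args(argv=None):
    """Return (known, leftover): parsed -a/-b flags and all unrecognized arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-a", action="store_true")
    parser.add_argument("-b", action="store_true")
    return parser.parse_known_args(argv)


def wrapper(argv=None, tool="ls"):  # "ls" is only a placeholder for the wrapped tool
    known, leftover = split_args(argv)
    print("a = %s, b = %s" % (known.a, known.b))
    # 1. process known.a and known.b here
    # 2. hand everything else to the wrapped UNIX tool
    return subprocess.run([tool, *leftover], check=False)


if __name__ == "__main__":
    wrapper()
```

parse_known_args returns the leftovers as a list of strings in their original order, so they can be passed to subprocess unchanged.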
You cannot do this with skip_unknown_args=True because, as @georgexsh pointed out, the argh library doesn't seem to behave sensibly with that option. However, you can provide your own parser class which injects the unknown arguments into the normal namespace:
class ArghParserWithUnknownArgs(argh.ArghParser):
    def parse_args(self, args=None, namespace=None):
        namespace = namespace or ArghNamespace()
        (namespace_obj, unknown_args) = super(ArghParserWithUnknownArgs, self).parse_known_args(args=args, namespace=namespace)
        namespace_obj.__dict__['unknown_args'] = unknown_args
        return namespace_obj
Note that this class's parse_args method calls the inherited parse_known_args method!
With this class defined, you can write the wrapper code the following way:
def wrapper(a=True, b=False, unknown_args={}):
    print("a = %s, b = %s" % (a,b))
    print("unknown_args = %s" % unknown_args)

if __name__ == '__main__':
    parser = ArghParserWithUnknownArgs()
    argh.set_default_command(parser, wrapper)
    argh.dispatch(parser)
In your main function wrapper you can access all unknown arguments via the parameter unknown_args and pass them on to your subprocess command.
P.S.: To keep the help message tidy, decorate wrapper with
@argh.arg('--unknown_args', help=argparse.SUPPRESS)
Addendum: I created an enhanced version of the parser and compiled it into a ready-to-use module. Find it on GitHub.
I import a Python module that already uses argparse, and I would like to use argparse in my own script as well. How should I go about doing this?
I'm receiving an unrecognized arguments error when using the following code and invoking the script with a -t flag:
Snippet:
#!/usr/bin/env python
....
import conflicting_module
import argparse
...
#################################
# Step 0: Configure settings... #
#################################
parser = argparse.ArgumentParser(description='Process command line options.')
parser.add_argument('--test', '-t')
Error:
unrecognized arguments: -t foobar
You need to guard the initialization code in your imported modules with
if __name__ == '__main__':
    ...
so that code such as argument parsing does not run on import. See What does if __name__ == "__main__": do?.
So, in your conflicting_module do
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Process command line options in conflicting_module.py.')
    parser.add_argument('--conflicting', '-c')
    ...
instead of just creating the parser globally.
If the parsing in conflicting_module is a mandatory part of application configuration, consider using
args, rest = parser.parse_known_args()
in your main module and passing rest to conflicting_module, where you'd pass either None or rest to parse_args:
args = parser.parse_args(rest)
That is still somewhat poor style; ideally the classes and functions in conflicting_module would receive parsed configuration arguments from your main module, which would be responsible for parsing them.
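The hand-off between the two parsers can be sketched as follows. Both functions are written in one file here only so the example is self-contained; parse_conflicting_args is a hypothetical stand-in for whatever entry point conflicting_module would expose, and the argv parameter is added for easy experimentation:

```python
import argparse


def parse_main_args(argv=None):
    """Main script: parse the options it knows and return the rest untouched."""
    parser = argparse.ArgumentParser(description="Process command line options.")
    parser.add_argument("--test", "-t")
    return parser.parse_known_args(argv)  # (namespace, list of leftover strings)


def parse_conflicting_args(rest):
    """Stand-in for conflicting_module: parse only the leftovers handed to it."""
    parser = argparse.ArgumentParser(description="Options for conflicting_module.")
    parser.add_argument("--conflicting", "-c")
    return parser.parse_args(rest)


if __name__ == "__main__":
    args, rest = parse_main_args()
    conflicting_args = parse_conflicting_args(rest)
    print(args, conflicting_args)
```

Because the second parser uses parse_args rather than parse_known_args, any option that neither parser recognizes is still reported as an error instead of being silently ignored.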