parsing mutually exclusive optional and positional arguments followed by pass-thru arguments - python

I am trying to emulate the Python interpreter command-line behavior, as would be represented by the help text:
command [options] [-m mod | file] [arg] ...
That is:
any number of arbitrary options (which are of the form -[a-zA-Z] that serve as a flag or with a single argument)
one of:
-m mod
file
zero or more arguments which should be available as-is
I have tried using the built-in argparse module, but unsuccessfully.
import argparse
parser = argparse.ArgumentParser()
selector = parser.add_mutually_exclusive_group(required=True)
selector.add_argument('file', nargs='?', help='path to script')
selector.add_argument('-m', help='module name')
parser.add_argument('args', nargs=argparse.REMAINDER)
parser.parse_args(['-m', 'hello', '--', 'arg1'])
Running this yields
usage: test.py [-h] [-m M] [file] ...
test.py: error: argument file: not allowed with argument -m
which makes sense, given that argparse seems to generally disregard the ordering of options - any positional arguments remaining after parsing options fill-in the positional arguments from first to last as specified.
I have tried defining custom argparse.Actions to do the job but it ends up looking pretty hacky, since the Action class corresponding to one of the arguments in the group needs to save an accumulated value for later inclusion in args.
I have also tried pre-processing the input to parser.parse_args, but do not like that approach since information about which options have values (to distinguish an option argument from the file argument) and which options are part of the group of terminal arguments (which should be considered the start of the pass-thru arguments [arg] ...) would be duplicated between the argparse.add_argument... calls and the pre-processing code.
What would be a good approach (other than requiring path to be provided with e.g. -f)?
Additional constraints:
I prefer to use argparse or something with a nice interface that correlates the arguments to help text and doesn't take long to load (argparse imports in 6ms for me)
I only need be compatible with Python 3.6 and above.
It is not ideal, but I am OK requiring users to include -- as the first arg if subsequent arguments (which would be passed through to the module or file) start with a - or may otherwise be mistaken for something in [options].

Even without the mutually exclusive grouping, file and args don't play nicely together:
In [2]: parser = argparse.ArgumentParser()
In [3]: parser.add_argument('-m');
In [4]: parser.add_argument('file', nargs='?');
In [6]: parser.add_argument('args', nargs=argparse.REMAINDER);
OK:
In [7]: parser.parse_args('-m foo a b c '.split())
Out[7]: Namespace(args=['b', 'c'], file='a', m='foo')
'--' just lets us use '-b' as a plain string:
In [8]: parser.parse_args('-m foo a -- -b c '.split())
Out[8]: Namespace(args=['-b', 'c'], file='a', m='foo')
'a' goes to 'file', and rest to 'args' - that's because all 'contiguous' positionals are evaluated together. With remainder, the -m flag is ignored, and treated like a plain string.
In [9]: parser.parse_args('a -m foo -- -b c '.split())
Out[9]: Namespace(args=['-m', 'foo', '--', '-b', 'c'], file='a', m=None)
In [10]: parser.parse_args('a -- -b c '.split())
Out[10]: Namespace(args=['-b', 'c'], file='a', m=None)
Argument allocation occurs even before the Action is called, so custom Action classes don't alter this behavior.
Flagged arguments give you the best control - over order and mutual-exclusivity.
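As a minimal sketch of that flagged approach (which the question hoped to avoid), both selectors can be made flags so that argparse itself enforces the exclusivity. The -f flag and the dest names module and file are assumptions made for illustration, not part of the original code.

```python
import argparse

# Hypothetical layout: both selectors are flags, so the mutually exclusive
# group can be enforced by argparse; plain positionals are passed through.
parser = argparse.ArgumentParser(prog='command')
selector = parser.add_mutually_exclusive_group(required=True)
selector.add_argument('-m', dest='module', help='module name')
selector.add_argument('-f', dest='file', help='path to script')  # assumed flag
parser.add_argument('args', nargs='*', help='arguments passed through as-is')

ns = parser.parse_args(['-m', 'hello', 'arg1', 'arg2'])
print(ns.module, ns.file, ns.args)
```

Pass-through arguments that start with a dash would still need a preceding '--', as the question already accepts.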

Related

Argparse: How to switch from default parser to a different subparser when a certain optional argument is given?

I have a certain script that is normally called with 2 positional arguments and bunch of optional arguments.
script.py <file1> <file2>
I want to add another subparser which should be called when I pass an optional argument.
script.py -file_list <files.list>
Basically, what I require is that when -file_list is passed, the parser shouldn't look for file1 and file2. I do not want the default case to require another option to invoke it (since the default case is already in use and thus I do not want to break it).
I tried keeping the default parser as is and creating subparser for -file_list. But the parser still expects the positional arguments file1 and file2.
Sample code (this doesn't work as I want it to):
args = argparse.ArgumentParser()
#default arguments
args.add_argument("file1", type=str)
args.add_argument("file2", type=str)
#subparser for file_list
file_list_sp = args.add_subparsers()
file_list_parser = file_list_sp.add_parser("-file_list")
file_list_parser.add_argument("file_list")
all_args = args.parse_args()
Maybe I need to create a separate subparser for the default case, but all subparsers seem to need an extra command to invoke them. I want the default case to be invoked automatically whenever -file_list is not passed.
You mention in passing other optionals, so I assume there's good incentive to keep argparse. Assuming there aren't other positionals, you could try:
In [23]: parser = argparse.ArgumentParser()
In [24]: parser.add_argument('--other'); # other optionals
In [25]: parser.add_argument('--filelist');
In [26]: parser.add_argument('files', nargs='*');
This accepts the '--filelist', without requiring the positional files, (with a resulting empty list):
In [27]: parser.parse_args('--filelist alist'.split())
Out[27]: Namespace(other=None, filelist='alist', files=[])
Or the files, but restricting that list to 2 requires your own testing after parsing:
In [28]: parser.parse_args('file1 file2'.split())
Out[28]: Namespace(other=None, filelist=None, files=['file1', 'file2'])
But it also accepts both forms. Again your own post parsing code will have to sort out the conflicting message:
In [29]: parser.parse_args('file1 file2 --filelist xxx'.split())
Out[29]: Namespace(other=None, filelist='xxx', files=['file1', 'file2'])
And you have to deal with the case where neither is provided:
In [30]: parser.parse_args('--other foobar'.split())
Out[30]: Namespace(other='foobar', filelist=None, files=[])
The help:
In [32]: parser.print_help()
usage: ipykernel_launcher.py [-h] [--other OTHER] [--filelist FILELIST]
[files ...]
positional arguments:
files
optional arguments:
-h, --help show this help message and exit
--other OTHER
--filelist FILELIST
So what I've created accepts both forms of input, but does not constrain them.
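The post-parsing checks mentioned above could look roughly like the following sketch. The function name parse_files and the error messages are hypothetical; parser.error prints the usage plus the message and exits.

```python
import argparse

def parse_files(argv):
    """Parse argv, then enforce: either --filelist, or exactly two files."""
    parser = argparse.ArgumentParser(prog='script.py')
    parser.add_argument('--other')            # other optionals
    parser.add_argument('--filelist')
    parser.add_argument('files', nargs='*')
    ns = parser.parse_args(argv)

    # Checks that argparse does not perform for us (hypothetical wording).
    if ns.filelist is not None and ns.files:
        parser.error('--filelist cannot be combined with positional files')
    if ns.filelist is None and len(ns.files) != 2:
        parser.error('expected file1 and file2, or --filelist')
    return ns

print(parse_files(['file1', 'file2']))
print(parse_files(['--filelist', 'alist']))
```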
I tried using a mutually_exclusive_group, but that only works with nargs='?'. I thought at one time nargs='*' worked in a group, but either my memory is wrong, or there was a patch that changed things. A positional in a m-x-group has to have 'required=False' value.
A subparsers is actually a special kind of positional. So if you created parser with usage 'prog file1 file2 {cmd1, cmd2}' it would still expect the 2 file names before checking on the third string. All positionals, including subparsers are handled by position, not value. A flagged argument (optional) can occur in any order, but does not override the positionals.
And if you define the subparsers first, then the first positional has to be one of those commands. subparsers are not required (by default), but that doesn't change the handling of other positionals.
working mutually exclusive group
nargs='*' does work if we also specify a default. This is an undocumented detail. I forgot about it until I looked at the code (specifically the _get_positional_kwargs method). A * is marked as required unless it's given a default. It was discussed in one or more bug reports some time ago.
So we can make the positional list and '--file_list' mutually exclusive:
In [71]: parser = argparse.ArgumentParser()
In [72]: parser.add_argument('--other'); # other optionals
In [73]: group = parser.add_mutually_exclusive_group(required=True)
In [74]: group.add_argument('--file_list');
In [75]: group.add_argument('files', nargs='*', default=[]);
Good cases:
In [76]: parser.parse_args('--other foobar file1 file2'.split())
Out[76]: Namespace(other='foobar', file_list=None, files=['file1', 'file2'])
In [77]: parser.parse_args('--other foobar --file_list file3'.split())
Out[77]: Namespace(other='foobar', file_list='file3', files=[])
If both are used:
In [78]: parser.parse_args('--other foobar --file_list file3 file1 file2'.split())
usage: ipykernel_launcher.py [-h] [--other OTHER] [--file_list FILE_LIST]
[files ...]
ipykernel_launcher.py: error: argument files: not allowed with argument --file_list
And if neither is provided:
In [79]: parser.parse_args('--other foobar'.split())
usage: ipykernel_launcher.py [-h] [--other OTHER] [--file_list FILE_LIST]
[files ...]
ipykernel_launcher.py: error: one of the arguments --file_list files is required
Usage now looks like:
In [81]: parser.print_usage()
usage: ipykernel_launcher.py [-h] [--other OTHER] [--file_list FILE_LIST]
[files ...]
It still doesn't control the number of files, except that there must be at least one.

Use Pythons argparse to make an argument being a flag or accepting a parameter

I am using argparse in Python 3. My requirement is to support the following 3 use-cases:
$ python3 foo.py --test <== results e.g. in True
$ python3 foo.py --test=foo <== results in foo
$ python3 foo.py <== results in arg.test is None or False
I found store_true and store, but can't find any configuration that achieves the argument list above. Either it wants a parameter or it takes none; neither seems to work. Is there any way to make an argument's value optional?
Use nargs='?' to make the argument optional, const=True for when the flag is given with no argument, and default=False for when the flag is not given.
Here's the explanation and an example from the documentation on nargs='?':
'?'. One argument will be consumed from the command line if possible, and produced as a single item. If no command-line argument is present, the value from default will be produced. Note that for optional arguments, there is an additional case - the option string is present but not followed by a command-line argument. In this case the value from const will be produced. Some examples to illustrate this:
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo', nargs='?', const='c', default='d')
>>> parser.add_argument('bar', nargs='?', default='d')
>>> parser.parse_args(['XX', '--foo', 'YY'])
Namespace(bar='XX', foo='YY')
>>> parser.parse_args(['XX', '--foo'])
Namespace(bar='XX', foo='c')
>>> parser.parse_args([])
Namespace(bar='d', foo='d')
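Applied to the three use-cases in the question, a minimal sketch (reusing the question's --test option name) would be:

```python
import argparse

parser = argparse.ArgumentParser()
# const is used when --test appears without a value,
# default when --test is absent altogether.
parser.add_argument('--test', nargs='?', const=True, default=False)

flag_only = parser.parse_args(['--test']).test
with_value = parser.parse_args(['--test=foo']).test
absent = parser.parse_args([]).test
print(flag_only, with_value, absent)
```

Note that the result is a bool in two of the cases and a string in the third, so the calling code has to handle both types.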

'argparse' with optional positional arguments that start with dash

We're trying to build a wrapper script over a command line tool we're using. We would like to set some tool arguments based on options in our wrapper scripts. We would also like to have the possibility to pass native arguments to the command line tool directly as they are written on the command line.
Here is what we came up with:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('positional')
parser.add_argument('-f', '--foo', action='store_true')
parser.add_argument('-b', '--bar', action='store_true')
parser.add_argument('native_arg', nargs='*')
args = parser.parse_args()
print (args)
positional is mandatory. Based on the options -f and -b we would add some extra options to our tool call. Anything that is left afterwards (if anything) should be treated as a native tool argument and given to the tool directly. Calling our script with -h produces the following usage:
usage: test.py [-h] [-f] [-b] positional [native_arg [native_arg ...]]
The trick is that these native arguments are themselves options for the tool and contain leading dashes, for example -native0 and -native1. We already know about the trick with the double dash to stop argparse from looking for more options. The following call:
./test.py pos -- -native0 -native1
produces the expected parsed arguments:
Namespace(bar=False, foo=False, native_arg=['-native0', '-native1'], positional='pos')
Trying to add an option after the first positional argument doesn't work, though. More specifically, the following call:
./test.py pos --foo -- -native0 -native1
produces the following output:
usage: [...shortened...]
test.py: error: unrecognized arguments: -- -native0 -native1
Putting the optional arguments before the positionals:
./test.py --foo pos -- -native0 -native1
seems to work, as the following is printed:
Namespace(bar=False, foo=True, native_arg=['-native0', '-native1'], positional='pos')
Even stranger, changing the value of nargs for native_arg to '+' works in all the above situations (with the caveat, of course, that at least one native_arg is expected).
Are we doing something wrong in our Python code or is this some kind of argparse bug?
argparse does have a hard time when you mix non-required positional arguments with optional arguments (see https://stackoverflow.com/a/47208725/1399279 for details on the bug report). Rather than suggesting a way to solve this issue, I am going to present an alternative approach.
You should check out the parse_known_args method, which was created for the situation you describe (i.e. passing options to a wrapped tool).
In [1]: import argparse
In [2]: parser = argparse.ArgumentParser()
In [3]: parser.add_argument('positional')
In [4]: parser.add_argument('-f', '--foo', action='store_true')
In [5]: parser.add_argument('-b', '--bar', action='store_true')
In [6]: parser.parse_known_args(['pos', '--foo', '-native0', '-native1'])
Out[6]: (Namespace(bar=False, foo=True, positional='pos'), ['-native0', '-native1'])
Unlike parse_args, the output of parse_known_args is a two-element tuple. The first element is the Namespace instance you would expect to get from parse_args, and it contains all the attributes defined by calls to add_argument. The second element is a list of all the arguments not known to the parser.
I personally prefer this method because the user does not need to remember any tricks about how to call your program, or which option order does not result in errors.
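A sketch of how the two-element result might be turned into the wrapped tool's command line follows. The tool name 'tool' and the --tool-foo / --tool-bar options are placeholders invented for illustration; only the parse_known_args call itself comes from the answer above.

```python
import argparse

def build_tool_command(argv):
    """Translate wrapper options and append unknown arguments unchanged."""
    parser = argparse.ArgumentParser()
    parser.add_argument('positional')
    parser.add_argument('-f', '--foo', action='store_true')
    parser.add_argument('-b', '--bar', action='store_true')
    known, native = parser.parse_known_args(argv)

    command = ['tool', known.positional]   # 'tool' is a placeholder name
    if known.foo:
        command.append('--tool-foo')       # hypothetical tool option
    if known.bar:
        command.append('--tool-bar')       # hypothetical tool option
    return command + native                # native arguments pass through as-is

print(build_tool_command(['pos', '--foo', '-native0', '-native1']))
```

One caveat: a native argument that matches one of the wrapper's own option strings (or an unambiguous abbreviation of a long one, such as --fo) would be consumed by the wrapper rather than passed through.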
This is a known issue (https://bugs.python.org/issue15112, argparse: nargs='*' positional argument doesn't accept any items if preceded by an option and another positional)
The parsing alternates between handling positionals and optionals. When dealing with positionals it tries to handle as many as the input strings allow. But a ? or * positional is satisfied with [], an empty list of strings. + on the other hand requires at least one string.
./test.py pos --foo -- -native0 -native1
The parser gives 'pos' to positional, and [] to native_arg. Then it gives '--foo' to its optional. There aren't any more positionals left to handle the remaining strings, so it raises the error.
The allocation of input strings is done with a stylized form of regex string matching. Imagine matching a pattern that looks like AA?.
To correct this, parser would have to look ahead, and delay handling native-arg. We've suggested patches but they aren't in production.
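The difference between * and + can be illustrated with plain regular expressions. This is a deliberately simplified stand-in, not argparse's actual internal patterns: 'A' marks a plain string, 'O' marks an option flag, and the '--' marker is ignored.

```python
import re

# Simplified stand-in for 'pos --foo -native0 -native1' (after '--'):
# one plain string, one option flag, two more plain strings.
pattern = 'AOAA'

# positional followed by a '*' positional: both match right away,
# and the '*' group is satisfied with nothing.
star = re.match(r'(A)(A*)', pattern)

# positional followed by a '+' positional: the pair cannot match here,
# because the second 'A' is blocked by the 'O'.
plus = re.match(r'(A)(A+)', pattern)

print(star.groups())
print(plus)
```

In the * case the second positional is used up with an empty match before the option is seen; in the + case the combined match fails, so that positional is still available once the option has been handled.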
@SethMMorton's suggestion of using parse_known_args is a good one.
Earlier parsers (e.g. optparse) handle all the flagged arguments, but return the rest, the positionals, as an undifferentiated list. It's up to the user to split that list. argparse has added the ability to name and parse positionals, but the algorithm works best with fixed nargs, and gets flaky with too many variable nargs.

Defining the order of the arguments with argparse - Python

I have the following command line tool:
import argparse
parser = argparse.ArgumentParser(description = "A cool application.")
parser.add_argument('positional')
parser.add_argument('--optional1')
parser.add_argument('--optional2')
args = parser.parse_args()
print(args.positional)
The output of python args.py is:
usage: args.py [-h] [--optional1 OPTIONAL1] [--optional2 OPTIONAL2]
positional
however I would like to have:
usage: args.py [-h] positional [--optional1 OPTIONAL1] [--optional2 OPTIONAL2]
How could I have that reordering?
You would either have to provide your own help formatter, or specify an explicit usage string:
parser = argparse.ArgumentParser(
description="A cool application.",
usage="args.py [-h] positional [--optional1 OPTIONAL1] [--optional2 OPTIONAL2]")
The order in the help message, though, does not affect the order in which you can specify the arguments. argparse processes any defined options left-to-right, then assigns any remaining arguments to the positional parameters from left to right. Options and positional arguments can, for the most part, be mixed.
With respect to each other, the order of positionals is fixed; that's why they are called that. But optionals (the flagged arguments) can occur in any order, and usually can be interspersed with the positionals (there are some practical constraints when allowing variable-length nargs).
For the usage line, argparse moves the positionals to the end of the list, but that's just a display convention.
There have been SO questions about changing that display order, but I think that is usually not needed. If you must change the display order, using a custom usage parameter is the simplest option. The programming way requires subclassing the help formatter and modifying a key method.
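A small sketch showing both points together: the explicit usage string changes only the display, while parsing accepts the optional before or after the positional. The prog value is set explicitly here so the output does not depend on how the script is launched.

```python
import argparse

parser = argparse.ArgumentParser(
    prog='args.py',
    description='A cool application.',
    # %(prog)s is filled in by argparse with the program name.
    usage='%(prog)s [-h] positional [--optional1 OPTIONAL1] [--optional2 OPTIONAL2]',
)
parser.add_argument('positional')
parser.add_argument('--optional1')
parser.add_argument('--optional2')

print(parser.format_usage())

# Display order does not constrain the order on the command line.
before = parser.parse_args(['--optional1', 'x', 'pos'])
after = parser.parse_args(['pos', '--optional1', 'x'])
print(before == after)
```

The drawback of a hand-written usage string is that it has to be kept in sync manually whenever an argument is added or renamed.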

Argparse: Optional arguments, distinct for different positional arguments

I want to have positional arguments with an optional argument. Something like my_command foo --version=0.1 baz bar --version=0.2. That should parse into a list of args [foo, bar, baz] with version set for 2 of them.
Without optional argument it's trivial to set nargs=* or nargs=+, but I'm struggling with providing an optional argument for positional ones. Is it even possible with argparse?
Multiple invocation of the same subcommand in a single command line
This tries to parse something like
$ python test.py executeBuild --name foobar1 executeBuild --name foobar2 ....
Both proposed solutions call a parser multiple times. Each call handles a cmd --name value pair. One splits sys.argv beforehand, the other collects unparsed strings with an argparse.REMAINDER argument.
Normally optionals can occur in any order. They are identified solely by that - flag value. Positionals have to occur in a particular order, but optionals may occur BETWEEN different positionals. Note also that in the usage display, optionals are listed first, followed by positionals.
usage: PROG [-h] [--version Version] [--other OTHER] FOO BAR BAZ
subparsers are the only way to link a positional argument with one or more optionals. But normally a parser is allowed to have only one subparsers argument.
Without subparsers, append is the only way to collect data from repeated uses of an optional:
parser.add_argument('--version',action='append')
parser.add_argument('foo')
parser.add_argument('bar')
parser.add_argument('baz')
would handle your input string, producing a namespace like:
namespace(version=['0.1','0.2'],foo='foo',bar='bar',baz='baz')
But there's no way of identifying baz as the one that is 'missing' a version value.
Regarding groups - argument groups just affect the help display. They have nothing to do with parsing.
How would you explain to your users (or yourself 6 mths from now) how to use this interface? What would the usage and help look like? It might be simpler to change the design to something that is easier to implement and to explain.
Here's a script which handles your sample input.
import argparse
usage = 'PROG [cmd [--version VERSION]]*'
parser = argparse.ArgumentParser(usage=usage)
parser.add_argument('cmd')
parser.add_argument('-v','--version')
parser.add_argument('rest', nargs=argparse.PARSER)
parser.print_usage()
myargv = 'foo --version=0.1 baz bar --version=0.2'.split()
# myargv = sys.argv[1:] # in production
myargv += ['quit'] # end loop flag
args = argparse.Namespace(rest=myargv)
collect = argparse.Namespace(cmd=[])
while True:
    args = parser.parse_args(args.rest)
    collect.cmd += [(args.cmd, args.version)]
    print(args)
    if args.rest[0]=='quit':
        break
print(collect)
It repeatedly parses a positional and optional, collecting the rest in an argparse.PARSER argument. This is like + in that it requires at least one string, but it collects ones that look like optionals as well. I needed to add a quit string so it wouldn't throw an error when there wasn't anything to fill this PARSER argument.
producing:
usage: PROG [cmd [--version VERSION]]*
Namespace(cmd='foo', rest=['baz', 'bar', '--version=0.2', 'quit'], version='0.1')
Namespace(cmd='baz', rest=['bar', '--version=0.2', 'quit'], version=None)
Namespace(cmd='bar', rest=['quit'], version='0.2')
Namespace(cmd=[('foo', '0.1'), ('baz', None), ('bar', '0.2')])
The positional argument that handles subparsers also uses this nargs value. That's how it recognizes and collects a cmd string plus everything else.
So it is possible to parse an argument string such as you want. But I'm not sure the code complexity is worth it. The code is probably fragile as well, tailored to this particular set of arguments, and just a few variants.
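For comparison, the other idea mentioned earlier, splitting the argument list by hand before (or instead of) calling argparse, can be sketched in a few lines. The function name pair_versions is hypothetical, and this version assumes only the --version=VALUE spelling is used.

```python
def pair_versions(argv):
    """Pair each name with a '--version=VALUE' that directly follows it."""
    pairs = []
    for token in argv:
        if token.startswith('--version='):
            if not pairs:
                raise ValueError('--version given before any name')
            name, _ = pairs[-1]
            # Attach the version to the most recent name.
            pairs[-1] = (name, token.split('=', 1)[1])
        else:
            pairs.append((token, None))
    return pairs

print(pair_versions('foo --version=0.1 baz bar --version=0.2'.split()))
```

This gives up argparse's help and error handling for these arguments, so it is a trade-off rather than a drop-in replacement.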
