'argparse' with optional positional arguments that start with dash - python

We're trying to build a wrapper script over a command line tool we're using. We would like to set some tool arguments based on options in our wrapper scripts. We would also like to have the possibility to pass native arguments to the command line tool directly as they are written on the command line.
Here is what we came up with:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('positional')
parser.add_argument('-f', '--foo', action='store_true')
parser.add_argument('-b', '--bar', action='store_true')
parser.add_argument('native_arg', nargs='*')
args = parser.parse_args()
print (args)
positional is mandatory. Based on the options -f and -b we would add some extra options to our tool call. Anything that is left afterwards (if anything) should be treated as a native tool argument and given to the tool directly. Calling our script with -h produces the following usage:
usage: test.py [-h] [-f] [-b] positional [native_arg [native_arg ...]]
The trick is that these native arguments are themselves options for the tool and contain leading dashes, for example -native0 and -native1. We already know about the trick with the double dash to stop argparse from looking for more options. The following call:
./test.py pos -- -native0 -native1
produces the expected parsed arguments:
Namespace(bar=False, foo=False, native_arg=['-native0', '-native1'], positional='pos')
Trying to add an option after the first positional argument doesn't work, though. More specifically, the following call:
./test.py pos --foo -- -native0 -native1
produces the following output:
usage: [...shortened...]
test.py: error: unrecognized arguments: -- -native0 -native1
Putting the optional arguments before the positionals:
./test.py --foo pos -- -native0 -native1
seems to work, as the following is printed:
Namespace(bar=False, foo=True, native_arg=['-native0', '-native1'], positional='pos')
Even stranger, changing the value of nargs for native_arg to '+' works in all the above situations (with the caveat, of course, that at least one native_arg is expected).
Are we doing something wrong in our Python code or is this some kind of argparse bug?

argparse does have a hard time when you mix non-required positional arguments with optional arguments (see https://stackoverflow.com/a/47208725/1399279 for details into the bug report). Rather than suggesting a way to solve this issue, I am going to present an alternative approach.
You should check out the parse_known_args method, which was created for the situation you describe (i.e. passing options to a wrapped tool).
In [1]: import argparse
In [2]: parser = argparse.ArgumentParser()
In [3]: parser.add_argument('positional')
In [4]: parser.add_argument('-f', '--foo', action='store_true')
In [5]: parser.add_argument('-b', '--bar', action='store_true')
In [6]: parser.parse_known_args(['pos', '--foo', '-native0', '-native1'])
Out[6]: (Namespace(bar=False, foo=True, positional='pos'), ['-native0', '-native1'])
Unlike parse_args, the output of parse_known_args is a two-element tuple. The first element is the Namespace instance you would expect to get from parse_args, and it contains all the attributes defined by calls to add_argument. The second element is a list of all the arguments not known to the parser.
I personally prefer this method because the user does not need to remember any tricks about how to call your program, or which option order does not result in errors.

This is a known issue (https://bugs.python.org/issue15112, argparse: nargs='*' positional argument doesn't accept any items if preceded by an option and another positional)
The parsing alternates handling positionals and optionals. When dealing with positionals it tries to handle as many as the input strings require. But an ? or * positional is satisfied with [], an empty list of strings. + on the other hand requires at least one string
./test.py pos --foo -- -native0 -native1
The parser gives 'pos' to positional, and [] to native-arg. Then it gives '--foo' to its optional. There aren't anymore positionals left to hand the remaining strings, so it raises the error.
The allocation of input strings is done with a stylized form of regex string matching. Imagine matching a pattern that looks like AA?.
To correct this, parser would have to look ahead, and delay handling native-arg. We've suggested patches but they aren't in production.
#SethMMorton's suggestion of using parse_known_args is a good one.
Earlier parsers (e.g. Optparse) handle all the flagged arguments, but return the rest, the positionals, as a undifferentiated list. It's up to the user to split that list. argparse has added the ability to name and parse positionals, but the algorithm works best with fixed nargs, and gets flaky with too many variable nargs.

Related

Python argparse with subparsers and optional positional arguments

I would like to have a program with subparsers that handles specific arguments while also keep some positional and optional arguments to the previous parsers (In fact what I really want is only one option, I mean, a valid subparser OR a valid local argument).
Example of something I wish to have: Program [{sectionName [{a,b}]}] [{c,d}]. Being c/d incompatible if sectionName was provided and viceversa.
However, the best I could achieve is this test.py [-h] {sectionName} ... [{c,d}]. This means, argparse don't allow me to use the positional arguments c or d without specifying a valid sectionName.
Here is the code:
import argparse
mainparser = argparse.ArgumentParser()
# Subparser
subparser = mainparser.add_subparsers(title="section", required=False)
subparser_parser = subparser.add_parser("sectionName")
subparser_parser.add_argument("attribute", choices=['a', 'b'], nargs='?')
# Main parser positional and optional attributes
mainparser.add_argument("attribute", choices=['c', 'd'], nargs='?')
mainparser.parse_args()
I'm getting crazy with this. Any help would be much appreciated!
Edit: I'm using Python 3.8
The subparser object is actually a positional Action, one that takes choices - in this case {'sectionName'}. positinal arguments are filled in the order that they are defined, using the nargs pattern to allocate strings.
Once the main parser gets the 'sectionName' it passes the parsing to subparser_parser. Than handles the rest of the input, such as the {'a','b'} positional. Anything it can't handle is put on the 'unrecognized' list, and control returns main for final processing. main does not do any further argument processing. Thus your attribute argument is ignored.
You could put define a attribute positional before the add_subparsers, but I wouldn't try to make it nargs='?'.
So it's best to define all main arguments before the subparsers, and to use optionals. This will give the cleanest and most reliable parsing.

parsing mutually exclusive optional and positional arguments followed by pass-thru arguments

I am trying to emulate the Python interpreter command-line behavior, as would be represented by the help text:
command [options] [-m mod | file] [arg] ...
That is:
any number of arbitrary options (which are of the form -[a-zA-Z] that serve as a flag or with a single argument)
one of:
-m mod
file
zero or more arguments which should be available as-is
I have tried using the built-in argparse module, but unsuccessfully.
import argparse
parser = argparse.ArgumentParser()
selector = parser.add_mutually_exclusive_group(required=True)
selector.add_argument('file', nargs='?', help='path to script')
selector.add_argument('-m', help='module name')
parser.add_argument('args', nargs=argparse.REMAINDER)
parser.parse_args(['-m', 'hello', '--', 'arg1'])
Running this yields
usage: test.py [-h] [-m M] [file] ...
test.py: error: argument file: not allowed with argument -m
which makes sense, given that argparse seems to generally disregard the ordering of options - any positional arguments remaining after parsing options fill-in the positional arguments from first to last as specified.
I have tried defining custom argparse.Actions to do the job but it ends up looking pretty hacky, since the Action class corresponding to one of the arguments in the group needs to save an accumulated value for later inclusion in args.
I have also tried pre-processing the input to parser.parse_args, but do not like that approach since information about which options have values (to distinguish an option argument from the file argument) and which options are part of the group of terminal arguments (which should be considered the start of the pass-thru arguments [arg] ...) would be duplicated between the argparse.add_argument... calls and the pre-processing code.
What would be a good approach (other than requiring path to be provided with e.g. -f)?
Additional constraints:
I prefer to use argparse or something with a nice interface that correlates the arguments to help text and doesn't take long to load (argparse imports in 6ms for me)
I only need be compatible with Python 3.6 and above.
It is not ideal, but I am OK requiring users to include -- as the first arg if subsequent arguments (which would be passed through to the module or file) start with a - or may otherwise be mistaken for something in [options].
Even without the mutually exclusive grouping, file and args don't play nicely together:
In [2]: parser = argparse.ArgumentParser()
In [3]: parser.add_argument('-m');
In [4]: parser.add_argument('file', nargs='?');
In [6]: parser.add_argument('args', nargs=argparse.REMAINDER);
OK:
In [7]: parser.parse_args('-m foo a b c '.split())
Out[7]: Namespace(args=['b', 'c'], file='a', m='foo')
'--' just lets us use '-b' as a plain string:
In [8]: parser.parse_args('-m foo a -- -b c '.split())
Out[8]: Namespace(args=['-b', 'c'], file='a', m='foo')
'a' goes to 'file', and rest to 'args' - that's because all 'contiguous' positionals are evaluated together. With remainder, the -m flag is ignored, and treated like a plain string.
In [9]: parser.parse_args('a -m foo -- -b c '.split())
Out[9]: Namespace(args=['-m', 'foo', '--', '-b', 'c'], file='a', m=None)
In [10]: parser.parse_args('a -- -b c '.split())
Out[10]: Namespace(args=['-b', 'c'], file='a', m=None)
Argument allocation occurs even before the Action is called, so custom Action classes don't alter this behavior.
Flagged arguments give you the best control - over order and mutual-exclusivity.

Defining the order of the arguments with argparse - Python

I have the following command line tool:
import argparse
parser = argparse.ArgumentParser(description = "A cool application.")
parser.add_argument('positional')
parser.add_argument('--optional1')
parser.add_argument('--optional2')
args = parser.parse_args()
print args.positionals
The output of python args.py is:
usage: args.py [-h] [--optional1 OPTIONAL1] [--optional2 OPTIONAL2]
positional
however I would like to have:
usage: args.py [-h] positional [--optional1 OPTIONAL1] [--optional2 OPTIONAL2]
How could I have that reordering?
You would either have to provide your own help formatter, or specify an explicit usage string:
parser = argparse.ArgumentParser(
description="A cool application.",
usage="args.py [-h] positional [--optional1 OPTIONAL1] [--optional2 OPTIONAL2]")
The order in the help message, though, does not affect the order in which you can specify the arguments. argparse processes any defined options left-to-right, then assigns any remaining arguments to the positional parameters from left to right. Options and positional arguments can, for the most part, be mixed.
With respect to each other the order of positionals is fixed - that's why they are called that. But optionals (the flagged arguments) can occur in any order, and usually can be interspersed with the postionals (there are some practical constrains when allowing variable length nargs.)
For the usage line, argparse moves the positionals to the end of the list, but that just a display convention.
There have been SO questions about changing that display order, but I think that is usually not needed. If you must change the display order, using a custom usage parameter is the simplest option. The programming way requires subclassing the help formatter and modifying a key method.

Why is there a difference when calling argparse.parse_args() or .parse_args(sys.argv)

I have created the following argument parser in my python code.
parser = argparse.ArgumentParser()
parser.add_argument('projectPath')
parser.add_argument('-project')
parser.add_argument('-release')
parser.add_argument('--test', default=False, action='store_true')
args = parser.parse_args()
and I'm executing my program the following way.
myProgram.py /path/to/file -project super --test
it works fine if I use the sysntax above with
args = parser.parse_args()
However if I take and use the sys.argv as input
args = parser.parse_args(sys.argv)
The parser is suddenly picky about the order of the arguments and I get the unrecognized argument error.
usage: fbu.py [-h] [-project PROJECT] [-release RELEASE] [--test] projectPath
fbu.py: error: unrecognized arguments: /path/to/file
As I can see from the error and also using the -h argument. The path argument must be last and the error makes sense in the last example.
But why does it not care about the order in the first example ?
EDIT: I'm using python version 3.4.3
sys.argv contains the script name as the first item, i.e. myProgram.py. That argument takes the spot of projectPath. Now there's one additional positional argument /path/to/file, which can't be matched to any arguments, hence the error.
Calling parse_args without arguments ArgumentParser is clever enough to omit the script name from being parsed. But when explicitly passing an array of arguments, it can't do that and will parse everything.
As you can see from looking at the source code for parse_known_args (which is called by parse_args):
if args is None:
# args default to the system args
args = _sys.argv[1:]
When you don't provide the arguments explicitly, Python removes the first item from .argv (which is the name of the script). If you pass the arguments manually, you must do this yourself:
parser.parse_args(sys.argv[1:])
This isn't explicitly covered in the documentation, but note that this section doesn't include a script name when calling parse_args manually:
Beyond sys.argv
Sometimes it may be useful to have an ArgumentParser parse arguments
other than those of sys.argv. This can be accomplished by passing a
list of strings to parse_args(). This is useful for testing at the
interactive prompt:
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument(
... 'integers', metavar='int', type=int, choices=xrange(10),
... nargs='+', help='an integer in the range 0..9')
>>> parser.add_argument(
... '--sum', dest='accumulate', action='store_const', const=sum,
... default=max, help='sum the integers (default: find the max)')
>>> parser.parse_args(['1', '2', '3', '4'])
Namespace(accumulate=<built-in function max>, integers=[1, 2, 3, 4])
>>> parser.parse_args('1 2 3 4 --sum'.split())
Namespace(accumulate=<built-in function sum>, integers=[1, 2, 3, 4])
The advantage of passing the arguments manually is that it makes it easier to test the parsing functionality, as you can pass in a list of appropriate arguments rather than trying to patch sys.argv.

Argparse: Optional arguments, distinct for different positional arguments

I want to have positional arguments with an optional argument. Smth like my_command foo --version=0.1 baz bar --version=0.2. That should parse into a list of args [foo, bar, baz] with version set for 2 of them.
Without optional argument it's trivial to set nargs=* or nargs=+, but I'm struggling with providing an optional argument for positional ones. Is it even possible with argparse?
Multiple invocation of the same subcommand in a single command line
This tries to parse something like
$ python test.py executeBuild --name foobar1 executeBuild --name foobar2 ....
Both proposed solutions call a parser multiple times. Each call handles a cmd --name value pair. One splits sys.argv before hand, the other collects unparsed strings with a argparse.REMAINDER argument.
Normally optionals can occur in any order. They are identified solely by that - flag value. Positionals have to occur in a particular order, but optionals may occur BETWEEN different positionals. Note also that in the usage display, optionals are listed first, followed by positionals.
usage: PROG [-h] [--version Version] [--other OTHER] FOO BAR BAZ
subparsers are the only way to link a positional argument with one or more optionals. But normally a parser is allowed to have only one subparsers argument.
Without subparsers, append is the only way to collect data from repeated uses of an optional:
parser.add_argument('--version',action='append')
parser.add_argument('foo')
parser.add_argument('bar')
parser.add_argument('baz')
would handle your input string, producing a namespace like:
namespace(version=['0.1','0.2'],foo='foo',bar='bar',baz='baz')
But there's no way of identifying baz as the one that is 'missing' a version value.
Regarding groups - argument groups just affect the help display. They have nothing to do with parsing.
How would you explain to your users (or yourself 6 mths from now) how to use this interface? What would the usage and help look like? It might be simpler to change the design to something that is easier to implement and to explain.
Here's a script which handles your sample input.
import argparse
usage = 'PROG [cmd [--version VERSION]]*'
parser = argparse.ArgumentParser(usage=usage)
parser.add_argument('cmd')
parser.add_argument('-v','--version')
parser.add_argument('rest', nargs=argparse.PARSER)
parser.print_usage()
myargv = 'foo --version=0.1 baz bar --version=0.2'.split()
# myargv = sys.argv[1:] # in production
myargv += ['quit'] # end loop flag
args = argparse.Namespace(rest=myargv)
collect = argparse.Namespace(cmd=[])
while True:
args = parser.parse_args(args.rest)
collect.cmd += [(args.cmd, args.version)]
print(args)
if args.rest[0]=='quit':
break
print collect
It repeatedly parses a positional and optional, collecting the rest in a argparse.PARSER argument. This is like + in that it requires at least one string, but it collects ones that look like optionals as well. I needed to add a quit string so it wouldn't throw an error when there wasn't anything to fill this PARSER argument.
producing:
usage: PROG [cmd [--version VERSION]]*
Namespace(cmd='foo', rest=['baz', 'bar', '--version=0.2', 'quit'], version='0.1')
Namespace(cmd='baz', rest=['bar', '--version=0.2', 'quit'], version=None)
Namespace(cmd='bar', rest=['quit'], version='0.2')
Namespace(cmd=[('foo', '0.1'), ('baz', None), ('bar', '0.2')])
The positional argument that handles subparsers also uses this nargs value. That's how it recognizes and collects a cmd string plus everything else.
So it is possible to parse an argument string such as you want. But I'm not sure the code complexity is worth it. The code is probably fragile as well, tailored to this particular set of arguments, and just a few variants.

Categories

Resources