Stop parsing on first unknown argument - python

Using argparse, is it possible to stop parsing arguments at the first unknown argument?
I've found two near-solutions:
parse_known_args, but this still allows known parameters to be detected after the first unknown argument.
nargs=argparse.REMAINDER, but this doesn't stop parsing until the first non-option argument. Any unrecognised options preceding it generate an error.
Have I overlooked something? Should I be using argparse at all?
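To make the parse_known_args behaviour concrete, here is a minimal sketch. The option names --foo and --bar are made up for illustration, and only --foo is defined on the parser.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--foo")

# --bar is unknown, yet the --foo that follows it is still parsed
# and overrides the earlier value.
known, unknown = parser.parse_known_args(["--foo", "1", "--bar", "--foo", "2"])
print(known.foo)
print(unknown)
```

The unknown option ends up in the leftover list, but parsing carries on past it, which is exactly the behaviour the question is trying to avoid.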

I haven't used argparse myself (need to keep my code 2.6-compatible), but looking through the docs, I don't think you've missed anything.
So I have to wonder why you want argparse to stop parsing arguments, and why the -- pseudo-argument won't do the job. From the docs:
If you have positional arguments that must begin with '-' and don’t look like negative numbers, you can insert the pseudo-argument '--' which tells parse_args() that everything after that is a positional argument:
>>> parser.parse_args(['--', '-f'])
Namespace(foo='-f', one=None)
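As a rough illustration of the -- pseudo-argument, the sketch below uses a hypothetical parser with one flag (--verbose) and a catch-all positional (rest); neither name comes from the question.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--verbose", action="store_true")
parser.add_argument("rest", nargs="*")

# Everything after "--" is treated as positional, even if it starts with "-".
args = parser.parse_args(["--verbose", "--", "--not-an-option", "file.txt"])
print(args.verbose)
print(args.rest)
```

This only helps when the caller is able to insert the -- at the right place on the command line.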

One way to do it, although it may not suit every situation, is to use getopt instead.
For example:
import sys
from getopt import getopt
USAGE = 'usage: ...'  # placeholder usage text
key = None
# 'k:' because -k/--key takes a value
flags, args = getopt(sys.argv[1:], 'hk:', ['help', 'key='])
for flag, v in flags:
    if flag in ['-h', '--help']:
        print(USAGE, file=sys.stderr)
        sys.exit()
    elif flag in ['-k', '--key']:
        key = v
Once getopt encounters a non-option argument it will stop processing arguments.
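A small sketch of that stopping behaviour, using made-up argument values. Note that it is the first non-option argument that stops getopt; an option it does not recognise raises getopt.GetoptError instead.

```python
import getopt

SHORT = "hk:"
LONG = ["help", "key="]

# Parsing stops at "subcommand", the first non-option argument, so the
# later --key is left untouched in the remainder.
opts, rest = getopt.getopt(
    ["-k", "secret", "subcommand", "--key", "ignored"], SHORT, LONG
)
print(opts)
print(rest)

# An unrecognised option does not stop parsing quietly; it raises.
try:
    getopt.getopt(["--bogus"], SHORT, LONG)
    unknown_raised = False
except getopt.GetoptError:
    unknown_raised = True
print(unknown_raised)
```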

Related

Problem using '-' inside an optional argument when using Python Argparse

I am parsing an argument input:
python parser_test.py --p "-999,-99;-9"
I get this error:
parser_test.py: error: argument --p: expected one argument
Is there a particular reason why including '-' in the optional argument
"-999,-99;-9"
throws the error even while within double quotes? I need to be able to include the '-' sign.
Here is the code:
import argparse
def main():
    parser = argparse.ArgumentParser(description='Input command line arguments for the averaging program')
    parser.add_argument('--p', help='input the missing data filler as an integer')
    args = parser.parse_args()
if __name__=='__main__':
    main()
The quotes do nothing to alter how argparse treats the -; the only purpose they serve is to prevent the shell from treating the ; as a command terminator.
argparse looks at all the arguments first and identifies which ones might be options, regardless of what options are actually defined, by checking which ones start with -. It makes an exception for things that could be negative numbers (like -999), but only if there are no defined options that look like numbers.
The solution is to prevent argparse from seeing -999,-99;-9 as a separate argument. Make it part of the argument that contains the --p using the --name=value form.
python parser_test.py --p="-999,-99;-9"
You can also use "--p=-999,-99;-9" or --p=-999,-99\;-9, among many other possibilities for writing an argument that will not cause the shell to parse your command line as two separate commands, python parser_test.py --p=-999,-99 and -9.
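The difference between the two forms can be checked without a shell by handing argparse the argument lists directly. This is only a sketch; the lists below stand in for what the shell would pass after quoting.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--p")

# Attached with "=", the value is never examined as a possible option.
attached = parser.parse_args(["--p=-999,-99;-9"])
print(attached.p)

# Passed separately, the value starts with "-" and is not a plain negative
# number, so argparse reports an error and exits.
try:
    parser.parse_args(["--p", "-999,-99;-9"])
    separate_failed = False
except SystemExit:
    separate_failed = True
print(separate_failed)
```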

Need to embed `-` character into arguments in python argparse

I am designing a tool to meet a spec. I have a scenario where I want the argument name itself to contain a -. Note arg-1 in the line below.
python test.py --arg-1 arg1Data
I am using the argparse library on Python 2.7. For some reason argparse gets confused by the above.
My question is how to avoid this? How can I keep the - in my argument?
A sample program (containing the -, if this is removed everything works fine):
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--arg-1", help="increase output verbosity")
args = parser.parse_args()
if args.args-1:
    print("verbosity turned on")
The argparse module replaces dashes with underscores in the attribute name, so:
if args.arg_1:
    print("verbosity turned on")
Python doc (second paragraph of section 15.4.3.11. dest) states:
Any internal - characters will be converted to _ characters to make
sure the string is a valid attribute name.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--arg-1", help="increase output verbosity")
parser.add_argument("arg-2")
args = parser.parse_args()
print(args)
produces:
1750:~/mypy$ python stack34970533.py -h
usage: stack34970533.py [-h] [--arg-1 ARG_1] arg-2
positional arguments:
arg-2
optional arguments:
-h, --help show this help message and exit
--arg-1 ARG_1 increase output verbosity
and
1751:~/mypy$ python stack34970533.py --arg-1 xxx yyy
Namespace(arg-2='yyy', arg_1='xxx')
The first argument is an optional. You can use '--arg-1' on the command line, but the value is stored as args.arg_1, because Python would interpret args.arg-1 as args.arg - 1. There's a long history of Unix command lines allowing flags that contain a -, so argparse tries to balance both traditions.
For positionals, argparse leaves you in full control of the dest attribute and does not change the - to _. If you want to access such a value you have to use the getattr approach. There is a bug/issue discussing whether this behavior should be changed or not. But for now, if you want to make it hard on yourself, that's your business.
Internally, argparse accesses the namespace with getattr and setattr to minimize restrictions on the attribute names.
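A short sketch of both access styles, reusing the --arg-1 and arg-2 names from the example above with the same sample values:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--arg-1")
parser.add_argument("arg-2")
args = parser.parse_args(["--arg-1", "xxx", "yyy"])

# The optional's dest has "-" converted to "_", so normal attribute access works.
optional_value = args.arg_1
# The positional keeps its "-", so it is only reachable through getattr.
positional_value = getattr(args, "arg-2")
print(optional_value, positional_value)
```

Passing dest or a differently named positional (for example arg_2 with a metavar) would avoid the getattr call, at the cost of a slightly different definition.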

Order of the arguments matters in getopt

My application parses the command line arguments:
import sys
import getopt
arguments = sys.argv[1:]
options, remainder = getopt.getopt(arguments, "aa:bb:cc:dd:h", ["aaaa=", "bbbb=", "cccc=", "dddd=", "help"])
print(dict(options))
This works great, but there is something odd: if I put the arguments in a different order, they don't get parsed
python my_app.py --aaaa=value1 --bbbb=value2 --cccc=value3 --dddd=value4 #ok
python my_app.py --dddd=value4 --bbbb=value2 --cccc=value3 --aaaa=value1 # empty
That's disappointing because the order of the arguments shouldn't matter, should it? Is there any way to solve that?
UPDATE:
python my_app.py -aa value1 # odd, empty { "-a" : "" }
python my_app.py -a value1 # even this empty { "-a" : "" }
As stated in the first comment to your question, your main example regarding failed parsing of arguments in a different order works just fine:
~/tmp/so$ python my_app.py --aaaa=value1 --bbbb=value2 --cccc=value3 --dddd=value4
{'--aaaa': 'value1', '--cccc': 'value3', '--dddd': 'value4', '--bbbb': 'value2'}
~/tmp/so$ python my_app.py --dddd=value4 --bbbb=value2 --cccc=value3 --aaaa=value1
{'--cccc': 'value3', '--bbbb': 'value2', '--aaaa': 'value1', '--dddd': 'value4'}
If that's not the case for you, please update the script to print the remainder as well, and show its output.
However, you have still misused the getopt library and that's the reason the latest examples you provided don't work as expected. You can't specify more than a single character as an option, since the second character would count as a new separate option. getopt provides no way to differentiate between two consecutive characters that count as a single option (with the first one carrying no argument value, as it is not followed by a colon) or a single option that is composed of two characters. From getopt.getopt's documentation, with my added emphasis:
options is the string of option letters that the script wants to recognize, with options that require an argument followed by a colon.
Therefore, when getopt parses your arguments, each time it encounters a -a argument, it matches it against the first a in the option string, which in your case is not followed by a colon. Thus it records the option with an empty value and moves on. If -aa was passed to the script, the second a is parsed as another -a flag, and a following word such as value1 is treated as a non-option argument, which ends option parsing.
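The effect of the doubled letters can be reproduced in isolation. The second call uses a corrected option string (one letter per option), which is an assumption about what the asker intended rather than something stated in the question.

```python
import getopt

# Original option string: the first "a" has no colon, so -a never takes a value.
broken = getopt.getopt(["-aa", "value1"], "aa:bb:cc:dd:h")
print(broken)

# One letter per option, each followed by a colon when it needs a value.
fixed = getopt.getopt(["-a", "value1"], "a:b:c:d:h")
print(fixed)
```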
Finally, regarding getopt and argparse. The documentation clearly advocates argparse:
The getopt module is a parser for command line options whose API is designed to be familiar to users of the C getopt() function. Users who are unfamiliar with the C getopt() function or who would like to write less code and get better help and error messages should consider using the argparse module instead.
More about why argparse is better than both getopt and the deprecated optparse can be read in this PEP and in the answers to this question.
The only functionality that I've found to be supported in getopt while it requires a bit of work in argparse is argument order permutation like that of gnu getopt. However, this question explains how this can be achieved via argparse.

Why isn't getopt working if sys.argv is passed fully?

If I'm using this with getopt:
import getopt
import sys
opts,args = getopt.getopt(sys.argv,"a:bc")
print(opts)
print(args)
opts will be empty; no tuples are created. If, however, I use sys.argv[1:], everything works as expected. I don't understand why that is. Can anyone explain?
The first element of sys.argv (sys.argv[0]) is the name of the script currently being executed. Because this script name is (likely) not a valid argument (and probably doesn't begin with a - or -- anyway), getopt does not recognize it as an argument. Due to the nature of how getopt works, when it sees something that is not a command-line flag (something that does not begin with - or --), it stops processing command-line options (and puts the rest of the arguments into args), because it assumes the rest of the arguments are items that will be handled by the program (such as filenames or other "required" arguments).
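The same thing can be seen with a hand-written list standing in for sys.argv (the script name and flags here are illustrative):

```python
import getopt

argv = ["script.py", "-a", "1", "-b"]

# With the script name included, parsing stops immediately at "script.py".
opts_full, args_full = getopt.getopt(argv, "a:bc")
print(opts_full, args_full)

# With the script name sliced off, the options are recognised.
opts_tail, args_tail = getopt.getopt(argv[1:], "a:bc")
print(opts_tail, args_tail)
```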
It's by design. Recall that sys.argv[0] is the running program name, and getopt doesn't want it.
From the docs:
Parses command line options and parameter list. args is the argument list to be parsed, without the leading reference to the running program. Typically, this means sys.argv[1:]. options is the string of option letters that the script wants to recognize, with options that require an argument followed by a colon (':'; i.e., the same format that Unix getopt() uses).
http://docs.python.org/library/getopt.html

With Python's optparse module, how do you create an option that takes a variable number of arguments?

With Perl's Getopt::Long you can easily define command-line options that take a variable number of arguments:
foo.pl --files a.txt --verbose
foo.pl --files a.txt b.txt c.txt --verbose
Is there a way to do this directly with Python's optparse module? As far as I can tell, the nargs option attribute can be used to specify a fixed number of option arguments, and I have not seen other alternatives in the documentation.
This took me a little while to figure out, but you can use the callback action on your options to get this done. Check out how I grab an arbitrary number of args for the "--filename" flag in this example.
from optparse import OptionParser
def cb(option, opt_str, value, parser):
    args = []
    # collect arguments up to the next option
    for arg in parser.rargs:
        if arg[0] != "-":
            args.append(arg)
        else:
            del parser.rargs[:len(args)]
            break
    if getattr(parser.values, option.dest):
        args.extend(getattr(parser.values, option.dest))
    setattr(parser.values, option.dest, args)
parser = OptionParser()
parser.add_option("-q", "--quiet",
                  action="store_false", dest="verbose",
                  help="be vewwy quiet (I'm hunting wabbits)")
parser.add_option("-f", "--filename",
                  action="callback", callback=cb, dest="file")
(options, args) = parser.parse_args()
print(options.file)
print(args)
The only side effect is that you get your args in a list instead of a tuple. That could easily be changed, but for my particular use case a list is desirable.
My mistake: just found this Callback Example 6.
I believe optparse does not support what you require (not directly -- as you noticed, you can do it if you're willing to do all the extra work of a callback!-). You could also do it most simply with the third-party extension argparse, which does support variable numbers of arguments (and also adds several other handy bits of functionality).
This URL documents argparse's add_argument -- passing nargs='*' lets the option take zero or more arguments, '+' lets it take one or more arguments, etc.
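For comparison, a minimal argparse sketch of the --files example from the question. It uses nargs='+' so that at least one file is required; the file names are the ones from the question.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--files", nargs="+")
parser.add_argument("--verbose", action="store_true")

one = parser.parse_args(["--files", "a.txt", "--verbose"])
many = parser.parse_args(["--files", "a.txt", "b.txt", "c.txt", "--verbose"])
print(one.files)
print(many.files)
```

The values are collected into a list, and collection stops at the next argument that looks like an option.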
Wouldn't you be better off with this?
foo.pl --files a.txt,b.txt,c.txt --verbose
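If the comma-separated form is acceptable, optparse can handle it with a much smaller callback. This is a sketch only; split_commas is a hypothetical helper name, and it assumes file names never contain commas.

```python
from optparse import OptionParser

def split_commas(option, opt_str, value, parser):
    # value is the single string that followed the option on the command line
    setattr(parser.values, option.dest, value.split(","))

parser = OptionParser()
parser.add_option("--files", type="string", action="callback",
                  callback=split_commas, dest="files")
parser.add_option("--verbose", action="store_true", dest="verbose", default=False)

options, args = parser.parse_args(["--files", "a.txt,b.txt,c.txt", "--verbose"])
print(options.files)
```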
I recently had this issue myself: I was on Python 2.6 and needed an option to take a variable number of arguments. I tried to use Dave's solution but found that it wouldn't work without also explicitly setting nargs to 0.
from optparse import OptionParser
def arg_list(option, opt_str, value, parser):
    args = set()
    # Iterate over a copy, since parser.rargs is modified inside the loop.
    for arg in list(parser.rargs):
        if arg[0] == '-':
            break
        args.add(arg)
        parser.rargs.pop(0)
    setattr(parser.values, option.dest, args)
parser = OptionParser()
parser.disable_interspersed_args()
parser.add_option("-f", "--filename", action="callback", callback=arg_list,
                  dest="file", nargs=0)
(options, args) = parser.parse_args()
The problem was that, by default, a new option added by add_option is assumed to have nargs = 1, and when nargs > 0 OptionParser will pop items off rargs and assign them to value before any callbacks are called. Thus, for options that do not specify nargs, rargs will always be off by one by the time your callback is called.
This callback can be used for any number of options; just have callback_args be the function to be called instead of setattr.
Here's one way: take the file-list pattern in as a string and then use glob (http://docs.python.org/2/library/glob.html) to do the expansion. This might not work without escaping the *, though.
Actually, I've got a better way. If this is your command line:
python myprog.py -V -l 1000 /home/dominic/radar/*.json
then the leftover arguments hold the file list:
parser, opts = parse_args()
inFileLst = parser.largs
