Why isn't getopt working if sys.argv is passed fully? - python

If I'm using this with getopt:
import getopt
import sys
opts,args = getopt.getopt(sys.argv,"a:bc")
print opts
print args
opts will be empty. No tuples will be created. If however, I'll use sys.argv[1:], everything works as expected. I don't understand why that is. Anyone care to explain?

The first element of sys.argv (sys.argv[0]) is the name of the script currently being executed. Because this script name is (likely) not a valid argument (and probably doesn't begin with a - or -- anyway), getopt does not recognize it as an argument. Due to the nature of how getopt works, when it sees something that is not a command-line flag (something that does not begin with - or --), it stops processing command-line options (and puts the rest of the arguments into args), because it assumes the rest of the arguments are items that will be handled by the program (such as filenames or other "required" arguments).

It's by design. Recall that sys.argv[0] is the running program name, and getopt doesn't want it.
From the docs:
Parses command line options and
parameter list. args is the argument
list to be parsed, without the leading
reference to the running program.
Typically, this means sys.argv[1:].
options is the string of option
letters that the script wants to
recognize, with options that require
an argument followed by a colon (':';
i.e., the same format that Unix
getopt() uses).
http://docs.python.org/library/getopt.html

Related

Problem using '-' inside an optional argument when using Python Argparse

I am parsing an argument input:
python parser_test.py --p "-999,-99;-9"
I get this error:
parser_test.py: error: argument --p: expected one argument
Is there a particular reason why including '-' in the optional argument
"-999,-99;-9"
throws the error even while within double quotes? I need to be able to include the '-' sign.
Here is the code:
import argparse
def main():
parser = argparse.ArgumentParser(description='Input command line arguments for the averaging program')
parser.add_argument('--p', help='input the missing data filler as an integer')
args = parser.parse_args()
if __name__=='__main__':
main()
The quotes do nothing to alter how argparse treats the -; the only purpose they serve is to prevent the shell from treating the ; as a command terminator.
argparse looks at all the arguments first and identifies which ones might be options, regardless of what options are actually defined, by checking which ones start with -. It makes an exception for things that could be negative numbers (like -999), but only if there are no defined options that look like numbers.
The solution is to prevent argparse from seeing -999,-99;-9 as a separate argument. Make it part of the argument that contains the -p using the --name=value form.
python parser_test.py --p="-999,-99;-9"
You can also use "--p=-999,-99;-9" or --p=-999,-99\;-9, among many other possibilities for writing an argument that will cause the shell to parse your command line as two separate commands, python parser_test.py --p-999,-99 and -9.

ConfigArgParse ignores abbreviated options?

my config.ini:
banana=original_banana
if I run with full argument name, I get the expected result:
python test_configargparse.py --banana new_banana
new_banana
if I run with abbreviated argument name (--ban instead of --banana), I get unexpected behaviour:
python test_configargparse.py --ban new_banana
original_banana
code for test_configargparse.py
import os, configargparse as ap
parser = ap.ArgumentParser(default_config_files=["config.ini"])
parser.add_argument('--banana',dest='banana')
options = parser.parse_args()
print(options.banana)
versions = ConfigArgParse==0.13.0, Python 2.7.10
is this a bug or am I missing something obvious?? it's a very basic feature in a very established module...
NOTE: this feature is explicitly documented in https://docs.python.org/3/library/argparse.html
allows long options to be abbreviated to a prefix, if the abbreviation is unambiguous (the prefix matches a unique option)
It looks like a bug in ConfigArgParse. When it loads options from the config file, it discards any option that is already on the command line.
discard_this_key = already_on_command_line(
args, action.option_strings)
The bug is that already_on_command_line() only checks for complete argument names, not prefixes.
def already_on_command_line(existing_args_list, potential_command_line_args):
"""Utility method for checking if any of the potential_command_line_args is
already present in existing_args.
"""
return any(potential_arg in existing_args_list
for potential_arg in potential_command_line_args)
That leaves two copies of the argument in the list, with the config file's value second. ArgumentParser takes the second value.

Prevent expansion of wildcards in non-quoted python script argument when running in UNIX environment

I have a python script that I'd like to supply with an argument (usually) containing wildcards, referring to a series of files that I'd like to do stuff with. Example here:
#!/usr/bin/env python
import argparse
import glob
parser = argparse.ArgumentParser()
parser.add_argument('-i', action="store", dest="i")
results = parser.parse_args()
print 'argument i is: ', results.i
list_of_matched_files = glob.glob(results.i)
In this case, everything works great if the user adds quotes to the passed argument like so:
./test_script.py -i "foo*.txt"
...but often times the users forget to add quotes to the argument and are stumped when the list only contains the first match because UNIX already expanded the list and argparse only then gets the first list element.
Is there a way (within the script) to prevent UNIX from expanding the list before passing it to python? Or maybe even just to test if the argument doesn't contain quotes and then warn the user?
No. Wildcards are expanded by the shell (Bash, zsh, csh, fish, whatever) before the script even runs, and the script can't do anything about them. Testing whether the argument contains quotes also won't work, as the shell similarly strips the quotes from "foo*.txt" before passing the argument to the script, so all Python sees is foo*.txt.
Its not UNIX that is doing the expansion, it is the shell.
Bash has an option set -o noglob (or -f) which turns off globbing (filename expansion), but that is non-standard.
If you give an end-user access to the command-line then they really should know about quoting. For example, the commonly used find command has a -name parameter which can take glob constructs but they have to be quoted in a similar manner. Your program is no different to any other.
If users can't handle that then maybe you should give them a different interface. You could go to the extreme of writing a GUI or a web/HTML front-end, but that's probably over the top.
Or why not prompt for the filename pattern? You could, for example, use a -p option to indicate prompting, e.g:
import argparse
import glob
parser = argparse.ArgumentParser()
parser.add_argument('-i', action="store", dest="i")
parser.add_argument('-p', action="store_true", default=False)
results = parser.parse_args()
if results.p:
pattern = raw_input("Enter filename pattern: ")
else:
pattern = results.i
list_of_matched_files = glob.glob(pattern)
print list_of_matched_files
(I have assumed Python 2 because of your print statement)
Here the input is not read by the shell but by python, which will not expand glob constructs unless you ask it to.
You can disable the expansion using set -f from the command line. (re-enable with set +f).
As jwodder correctly says though, this happens before the script is run, so the only way I can think of to do this is to wrap it with a shell script that disables expansion temporarily, runs the python script, and re-enables expansion. Preventing UNIX from expanding the list before passing it to python is not possible.
Here is an example for the Bash shell that shows what #Tom Wyllie is talking about:
alias sea='set -f; search_function'
search_function() { perl /home/scripts/search.pl $# ; set +f; }
This defines an alias called "sea" that:
Turns off expansion ("set -f")
Runs the search_function function which is a perl script
Turns expansion back on ("set +f")
The problem with this is that if a user stops execution with ^C or some such then the expansion may not be turned back on leaving the user puzzling why "ls *" is not working. So I'm not necessarily advocating using this. :).
This worked for me:
files = sys.argv[1:]
Even though only one string is on the command line, the shell expands the wildcards and fills sys.argv[] with the list.

Using OptionParser vs sys.argv

For a script, I'm currently using OptionParser to add variables to an input. However, all of my current options are booleans, and it seems it would just be easier to parse using argv instead. For example:
$ script.py option1 option4 option6
And then do something like:
if 'option1' in argv:
do this
if 'option2' in argv:
do this
etc...
Would it be suggested to use argv over OptionParser when the optionals are all booleans?
"However, all of my current options are booleans, and it seems it
would just be easier to parse using argv instead."
There's nothing wrong with using argv, and if it's simpler to use argv, there's no reason not to.
OptionParser has been deprecated, and unless you're stuck on an older version of python, you should use the ArgParser module.
For one-off scripts, there's nothing wrong with parsing sys.argv yourself. There are some advantages to using an argument parsing module instead of writing your own.
Standardized. Do you allow options like "-test", because the standard is usually 2 underscores for multichar options (e.g. "--test"). With a module, you don't have to worry about defining standards because they're already defined.
Do you need error-catching and help messages? Because you get a lot of that for free with ArgParse.
Will someone else be maintaining your code? There's already lots of documentation and examples of ArgParse. Plus, it's somewhat self documenting, because you have to specify the type and number of arguments, which isn't always apparent from looking at a sys.argv parser.
Basically, if you ever expect your command line options to change over time, or expect that your code will have to be modified by someone else, the overhead of ArgParse isn't that bad and would probably save you time in the future.

Stop parsing on first unknown argument

Using argparse, is it possible to stop parsing arguments at the first unknown argument?
I've found 2 almost solutions;
parse_known_args, but this allows for known parameters to be detected after the first unknown argument.
nargs=argparse.REMAINDER, but this won't stop parsing until the first non-option argument. Any options preceding this that aren't recognised generate an error.
Have I overlooked something? Should I be using argparse at all?
I haven't used argparse myself (need to keep my code 2.6-compatible), but looking through the docs, I don't think you've missed anything.
So I have to wonder why you want argparse to stop parsing arguments, and why the -- pseudo-argument won't do the job. From the docs:
If you have positional arguments that must begin with '-' and don’t look like negative numbers, you can insert the pseudo-argument '--' which tells parse_args() that everything after that is a positional argument:
>>> parser.parse_args(['--', '-f'])
Namespace(foo='-f', one=None)
One way to do it, although it may not be perfect in all situations, is to use getopt instead.
for example:
import sys
import os
from getopt import getopt
flags, args = getopt(sys.argv[1:], 'hk', ['help', 'key='])
for flag, v in flags:
if flag in ['-h', '--help']:
print(USAGE, file=sys.stderr)
os.exit()
elif flag in ['-k', '--key']:
key = v
Once getopt encounters a non-option argument it will stop processing arguments.

Categories

Resources