Problem using '-' inside an optional argument when using Python Argparse - python

I am parsing an argument input:
python parser_test.py --p "-999,-99;-9"
I get this error:
parser_test.py: error: argument --p: expected one argument
Is there a particular reason why including '-' in the optional argument
"-999,-99;-9"
throws the error even while within double quotes? I need to be able to include the '-' sign.
Here is the code:
import argparse
def main():
parser = argparse.ArgumentParser(description='Input command line arguments for the averaging program')
parser.add_argument('--p', help='input the missing data filler as an integer')
args = parser.parse_args()
if __name__=='__main__':
main()

The quotes do nothing to alter how argparse treats the -; the only purpose they serve is to prevent the shell from treating the ; as a command terminator.
argparse looks at all the arguments first and identifies which ones might be options, regardless of what options are actually defined, by checking which ones start with -. It makes an exception for things that could be negative numbers (like -999), but only if there are no defined options that look like numbers.
The solution is to prevent argparse from seeing -999,-99;-9 as a separate argument. Make it part of the argument that contains the -p using the --name=value form.
python parser_test.py --p="-999,-99;-9"
You can also use "--p=-999,-99;-9" or --p=-999,-99\;-9, among many other possibilities for writing an argument that will cause the shell to parse your command line as two separate commands, python parser_test.py --p-999,-99 and -9.

Related

Prevent expansion of wildcards in non-quoted python script argument when running in UNIX environment

I have a python script that I'd like to supply with an argument (usually) containing wildcards, referring to a series of files that I'd like to do stuff with. Example here:
#!/usr/bin/env python
import argparse
import glob
parser = argparse.ArgumentParser()
parser.add_argument('-i', action="store", dest="i")
results = parser.parse_args()
print 'argument i is: ', results.i
list_of_matched_files = glob.glob(results.i)
In this case, everything works great if the user adds quotes to the passed argument like so:
./test_script.py -i "foo*.txt"
...but often times the users forget to add quotes to the argument and are stumped when the list only contains the first match because UNIX already expanded the list and argparse only then gets the first list element.
Is there a way (within the script) to prevent UNIX from expanding the list before passing it to python? Or maybe even just to test if the argument doesn't contain quotes and then warn the user?
No. Wildcards are expanded by the shell (Bash, zsh, csh, fish, whatever) before the script even runs, and the script can't do anything about them. Testing whether the argument contains quotes also won't work, as the shell similarly strips the quotes from "foo*.txt" before passing the argument to the script, so all Python sees is foo*.txt.
Its not UNIX that is doing the expansion, it is the shell.
Bash has an option set -o noglob (or -f) which turns off globbing (filename expansion), but that is non-standard.
If you give an end-user access to the command-line then they really should know about quoting. For example, the commonly used find command has a -name parameter which can take glob constructs but they have to be quoted in a similar manner. Your program is no different to any other.
If users can't handle that then maybe you should give them a different interface. You could go to the extreme of writing a GUI or a web/HTML front-end, but that's probably over the top.
Or why not prompt for the filename pattern? You could, for example, use a -p option to indicate prompting, e.g:
import argparse
import glob
parser = argparse.ArgumentParser()
parser.add_argument('-i', action="store", dest="i")
parser.add_argument('-p', action="store_true", default=False)
results = parser.parse_args()
if results.p:
pattern = raw_input("Enter filename pattern: ")
else:
pattern = results.i
list_of_matched_files = glob.glob(pattern)
print list_of_matched_files
(I have assumed Python 2 because of your print statement)
Here the input is not read by the shell but by python, which will not expand glob constructs unless you ask it to.
You can disable the expansion using set -f from the command line. (re-enable with set +f).
As jwodder correctly says though, this happens before the script is run, so the only way I can think of to do this is to wrap it with a shell script that disables expansion temporarily, runs the python script, and re-enables expansion. Preventing UNIX from expanding the list before passing it to python is not possible.
Here is an example for the Bash shell that shows what #Tom Wyllie is talking about:
alias sea='set -f; search_function'
search_function() { perl /home/scripts/search.pl $# ; set +f; }
This defines an alias called "sea" that:
Turns off expansion ("set -f")
Runs the search_function function which is a perl script
Turns expansion back on ("set +f")
The problem with this is that if a user stops execution with ^C or some such then the expansion may not be turned back on leaving the user puzzling why "ls *" is not working. So I'm not necessarily advocating using this. :).
This worked for me:
files = sys.argv[1:]
Even though only one string is on the command line, the shell expands the wildcards and fills sys.argv[] with the list.

Python: Subprocess works different to terminal. What I have to change?

I have to Python scripts: Tester1.py and Tester2.py.
Within Tester1 I want to start from time to time Tester2.py. I also want to pass Tester2.py some arguments. At the moment my code looks like this:
Tester1:
subprocess.call(['python3 Tester2.py testString'])
Tester2:
def start():
message = sys.argv[1]
print(message)
start()
Now my problem: If I run with my terminal Tester2 like 'python3 Tester2.py testString'my console prints out testString. But if I run Tester1 and Tester1 tries to start Tester2, I get an IndexError: "list index out of range".
How do I need to change my code to get everything working?
EDIT:
niemmi told me that I have to change my code to:
subprocess.call(['python3', 'Tester2.py', 'testString'])
but now I get a No such file or directory Error although both scripts are in the same directory. Someone knows why?
You need to provide the arguments either as separate elements on a list or as a string:
subprocess.call(['python3', 'Tester2.py', 'testString'])
# or
subprocess.call('python3 Tester2.py testString')
Python documentation has following description:
args is required for all calls and should be a string, or a sequence of program arguments. Providing a sequence of arguments is generally preferred, as it allows the module to take care of any required escaping and quoting of arguments (e.g. to permit spaces in file names). If passing a single string, either shell must be True (see below) or else the string must simply name the program to be executed without specifying any arguments.

Need to embed `-` character into arguments in python argparse

I am designing a tool to meet some spec. I have a scenario where I want the argument to contain - its string. Pay attention to arg-1 in the below line.
python test.py --arg-1 arg1Data
I am using the argparse library on python27. For some reason the argparse gets confused with the above trial.
My question is how to avoid this? How can I keep the - in my argument?
A sample program (containing the -, if this is removed everything works fine):
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--arg-1", help="increase output verbosity")
args = parser.parse_args()
if args.args-1:
print "verbosity turned on"
Python argparse module replace dashes by underscores, thus:
if args.arg_1:
print "verbosity turned on"
Python doc (second paragraph of section 15.4.3.11. dest) states:
Any internal - characters will be converted to _ characters to make
sure the string is a valid attribute name.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--arg-1", help="increase output verbosity")
parser.add_argument("arg-2")
args = parser.parse_args()
print(args)
produces:
1750:~/mypy$ python stack34970533.py -h
usage: stack34970533.py [-h] [--arg-1 ARG_1] arg-2
positional arguments:
arg-2
optional arguments:
-h, --help show this help message and exit
--arg-1 ARG_1 increase output verbosity
and
1751:~/mypy$ python stack34970533.py --arg-1 xxx yyy
Namespace(arg-2='yyy', arg_1='xxx')
The first argument is an optional. You can use '--arg-1' in commandline, but the value is stored as args.arg_1. Python would interpret args.arg-1 as args.arg - 1. There's a long history of unix commandlines allowing flags with a -. It tries to balance both traditions.
It leaves you in full control of the positionals dest attribute, and does not change the - to _. If you want to access that you have to use the getattr approach. There is bug/issue discussing whether this behavior should be changed or not. But for now, if you want to make it hard on yourself, that's your business.
Internally, argparse accesses the namespace with getattr and setattr to minimize restrictions on the attribute names.

Stop parsing on first unknown argument

Using argparse, is it possible to stop parsing arguments at the first unknown argument?
I've found 2 almost solutions;
parse_known_args, but this allows for known parameters to be detected after the first unknown argument.
nargs=argparse.REMAINDER, but this won't stop parsing until the first non-option argument. Any options preceding this that aren't recognised generate an error.
Have I overlooked something? Should I be using argparse at all?
I haven't used argparse myself (need to keep my code 2.6-compatible), but looking through the docs, I don't think you've missed anything.
So I have to wonder why you want argparse to stop parsing arguments, and why the -- pseudo-argument won't do the job. From the docs:
If you have positional arguments that must begin with '-' and don’t look like negative numbers, you can insert the pseudo-argument '--' which tells parse_args() that everything after that is a positional argument:
>>> parser.parse_args(['--', '-f'])
Namespace(foo='-f', one=None)
One way to do it, although it may not be perfect in all situations, is to use getopt instead.
for example:
import sys
import os
from getopt import getopt
flags, args = getopt(sys.argv[1:], 'hk', ['help', 'key='])
for flag, v in flags:
if flag in ['-h', '--help']:
print(USAGE, file=sys.stderr)
os.exit()
elif flag in ['-k', '--key']:
key = v
Once getopt encounters a non-option argument it will stop processing arguments.

Why isn't getopt working if sys.argv is passed fully?

If I'm using this with getopt:
import getopt
import sys
opts,args = getopt.getopt(sys.argv,"a:bc")
print opts
print args
opts will be empty. No tuples will be created. If however, I'll use sys.argv[1:], everything works as expected. I don't understand why that is. Anyone care to explain?
The first element of sys.argv (sys.argv[0]) is the name of the script currently being executed. Because this script name is (likely) not a valid argument (and probably doesn't begin with a - or -- anyway), getopt does not recognize it as an argument. Due to the nature of how getopt works, when it sees something that is not a command-line flag (something that does not begin with - or --), it stops processing command-line options (and puts the rest of the arguments into args), because it assumes the rest of the arguments are items that will be handled by the program (such as filenames or other "required" arguments).
It's by design. Recall that sys.argv[0] is the running program name, and getopt doesn't want it.
From the docs:
Parses command line options and
parameter list. args is the argument
list to be parsed, without the leading
reference to the running program.
Typically, this means sys.argv[1:].
options is the string of option
letters that the script wants to
recognize, with options that require
an argument followed by a colon (':';
i.e., the same format that Unix
getopt() uses).
http://docs.python.org/library/getopt.html

Categories

Resources