What is the canonical way of handling sys arguments in Python?

What is the canonical way of handling sys arguments in Python? - python

Let's say I want to make a hashing script:
### some code here
def hashlib_based(path, htype='md5', block_size=2**16):
hash = eval(htype)
with open(path, 'rb') as f:
for block in iter(lambda: f.read(block_size), ''):
hash().update(block)
f.close()
return hash().hexdigest()
### some code here
As you can see, I have the opportunity to use different flags to allow me to change the hash type or the block size when I call the script from the command line (for example ./myscript.py -sha1 -b 512 some_file.ext). The thing is, I don't have any clue on how should I do this in order to keep my code as clean and readable as possible. How do I deal with sys.argv?
First of all, how do I check if the user uses the correct flags? I need to do that in order to print out a usage message. Do I make a list with all the flags, then I check if the user uses one that is in that list?
Should I do all these things inside main() or should I do place them in a different function?
Should I construct my flags with a hyphen-minus in front of them (like this: -a, -b) or without one? To check if a certain flag is present in sys.argv, do I simply do something like:
if '-v' in sys.argv:
verbose = True
?
Because sys.argv has indexes, what is the best way to ignore the order of the flags - or in other words, should ./myscript.py -a -b be the same as ./myscript.py -b -a? While it certainly makes the job easier for the common user, is it common practice to do so?
I saw something similar but for C#. Is there a similar concept in Python?
The thing is, as simple as these things are, they get out of hands quickly - for me at least. I end up doing a mess. What is your approach to this problem?

For really simple use cases, such as checking the presence of one argument, you can do a check like you're showing, i.e.:
if '-v' in sys.argv: ...
which is the quick'n dirty way of checking arguments. But once your project gets a bit more serious, you definitely need to use an argument parsing library.
And there are a few ones to handle argument parsing: there is the now deprecated getopt (I won't give a link), the most common one is argparse which is included in any python distribution.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-a', '--a-long', help='a help')
parser.add_argument('-b', '--b-long', help='b help')
args = parser.parse_args()
then you can call your script -a -b or script -b -a which will be equivalent. And for free, you've got script -h for free! :-)
Though, my preference is now over docopt, imho, which is way simpler and more elegant, for the same example:
"""
My script.
usage:
myscript -a | --along
myscript -b | --blong
Options:
-a --along a help
-b --blong b help
"""
from docopt import docopt
arguments = docopt(__doc__, version='myscript 1.0')
print(arguments)
HTH

Related

Python - argparse - (Singleton) Argument w/ optional parameter [duplicate]

I'm trying to implement the following argument dependency using the argparse module:
./prog [-h | [-v schema] file]
meaning the user must pass either -h or a file, if a file is passed the user can optionally pass -v schema.
That's what I have now but that doesn't seem to be working:
import argparse
parser = argparse.ArgumentParser()
mtx = parser.add_mutually_exclusive_group()
mtx.add_argument('-h', ...)
grp = mtx.add_argument_group()
grp.add_argument('-v', ...)
grp.add_argument('file', ...)
args = parser.parse_args()
It looks like you can't add an arg group to a mutex group or am I missing something?

If -h means the default help, then this is all you need (this help is already exclusive)
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('file')
parser.add_argument('-s','--schema')
parser.parse_args('-h'.split()) # parser.print_help()
producing
usage: stack23951543.py [-h] [-s SCHEMA] file
...
If by -h you mean some other action, lets rename it -x. This would come close to what you describe
parser = argparse.ArgumentParser()
parser.add_argument('-s','--schema', default='meaningful default value')
mxg = parser.add_mutually_exclusive_group(required=True)
mxg.add_argument('-x','--xxx', action='store_true')
mxg.add_argument('file', nargs='?')
parser.parse_args('-h'.split())
usage is:
usage: stack23951543.py [-h] [-s SCHEMA] (-x | file)
Now -x or file is required (but not both). -s is optional in either case, but with a meaningful default, it doesn't matter if it is omitted. And if -x is given, you can just ignore the -s value.
If necessary you could test args after parsing, to confirm that if args.file is not None, then args.schema can't be either.
Earlier I wrote (maybe over thinking the question):
An argument_group cannot be added to a mutually_exclusive_group. The two kinds of groups have different purposes and functions. There are previous SO discussions of this (see 'related'), as well as a couple of relevant Python bug issues. If you want tests that go beyond a simple mutually exclusive group, you probably should do your own testing after parse_args. That may also require your own usage line.
An argument_group is just a means of grouping and labeling arguments in the help section.
A mutually_exclusive_group affects the usage formatting (if it can), and also runs tests during parse_args. The use of 'group' for both implies that they are more connected than they really are.
http://bugs.python.org/issue11588 asks for nested groups, and the ability to test for 'inclusivity' as well. I tried to make the case that 'groups' aren't general enough to express all the kinds of testing that users want. But it's one thing to generalize the testing mechanism, and quite another to come up with an intuitive API. Questions like this suggest that argparse does need some sort of 'nested group' syntax.

Prevent expansion of wildcards in non-quoted python script argument when running in UNIX environment

I have a python script that I'd like to supply with an argument (usually) containing wildcards, referring to a series of files that I'd like to do stuff with. Example here:
#!/usr/bin/env python
import argparse
import glob
parser = argparse.ArgumentParser()
parser.add_argument('-i', action="store", dest="i")
results = parser.parse_args()
print 'argument i is: ', results.i
list_of_matched_files = glob.glob(results.i)
In this case, everything works great if the user adds quotes to the passed argument like so:
./test_script.py -i "foo*.txt"
...but often times the users forget to add quotes to the argument and are stumped when the list only contains the first match because UNIX already expanded the list and argparse only then gets the first list element.
Is there a way (within the script) to prevent UNIX from expanding the list before passing it to python? Or maybe even just to test if the argument doesn't contain quotes and then warn the user?

No. Wildcards are expanded by the shell (Bash, zsh, csh, fish, whatever) before the script even runs, and the script can't do anything about them. Testing whether the argument contains quotes also won't work, as the shell similarly strips the quotes from "foo*.txt" before passing the argument to the script, so all Python sees is foo*.txt.

Its not UNIX that is doing the expansion, it is the shell.
Bash has an option set -o noglob (or -f) which turns off globbing (filename expansion), but that is non-standard.
If you give an end-user access to the command-line then they really should know about quoting. For example, the commonly used find command has a -name parameter which can take glob constructs but they have to be quoted in a similar manner. Your program is no different to any other.
If users can't handle that then maybe you should give them a different interface. You could go to the extreme of writing a GUI or a web/HTML front-end, but that's probably over the top.
Or why not prompt for the filename pattern? You could, for example, use a -p option to indicate prompting, e.g:
import argparse
import glob
parser = argparse.ArgumentParser()
parser.add_argument('-i', action="store", dest="i")
parser.add_argument('-p', action="store_true", default=False)
results = parser.parse_args()
if results.p:
pattern = raw_input("Enter filename pattern: ")
else:
pattern = results.i
list_of_matched_files = glob.glob(pattern)
print list_of_matched_files
(I have assumed Python 2 because of your print statement)
Here the input is not read by the shell but by python, which will not expand glob constructs unless you ask it to.

You can disable the expansion using set -f from the command line. (re-enable with set +f).
As jwodder correctly says though, this happens before the script is run, so the only way I can think of to do this is to wrap it with a shell script that disables expansion temporarily, runs the python script, and re-enables expansion. Preventing UNIX from expanding the list before passing it to python is not possible.

Here is an example for the Bash shell that shows what #Tom Wyllie is talking about:
alias sea='set -f; search_function'
search_function() { perl /home/scripts/search.pl $# ; set +f; }
This defines an alias called "sea" that:
Turns off expansion ("set -f")
Runs the search_function function which is a perl script
Turns expansion back on ("set +f")
The problem with this is that if a user stops execution with ^C or some such then the expansion may not be turned back on leaving the user puzzling why "ls *" is not working. So I'm not necessarily advocating using this. :).

This worked for me:
files = sys.argv[1:]
Even though only one string is on the command line, the shell expands the wildcards and fills sys.argv[] with the list.

How to require one command line action argument among several possible but exclusive?

Using argparse, is there a simple way to specify arguments which are mutually exclusive so that the application asks for one of these arguments have to be provided but only one of them?
Example of fictive use-case:
> myapp.py foo --bar
"Foo(bar) - DONE"
> myapp.py read truc.txt
"Read: truc.txt - DONE"
>myapp.py foo read
Error: use "myapp.py foo [options]" or "myapp.py read [options]" (or something similar).
> myapp.py foo truc.txt
Error: "foo" action don't need additional info.
> myapp.py read --bar
Error: "read" action don't have a "--bar" option.
My goal is to have a "driver" application(1) that would internally apply one action depending on the first command line argument and have arguments depending on the action.
So far I see no obvious ways to do this with argparse without manually processing the arguments myself, but maybe I missed something Pythonic? (I'm not a Python3 expert...yet)
I call it "driver" because it might be implemented by calling another application, like gcc does with different compilers.

What you're trying to do is actually supported quite well in Python.
See Mutual Exclusion
parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument('foo', dest='foo', nargs=1)
group.add_argument('read', dest='read', nargs=1)
args = parser.parse_args()
return args

optparse(): Input validation

My apology in advance if it's already answered somewhere; I've been in the python site since last hr. but didn't quite figure out how I can I do this. My script should take the options like this:
myScript.py -f <file-name> -e [/ -d]
myScript.py -s <string> -e [/ -d]
myScript.py -f <file-name> [/ -s <string>] -e -w [<file_name>]
i.e. -f/-s,-e/-d are mandatory options but -f&-s cannot be used together and the same as with -e&-d options - cannot be used together. How can I put the check in place?
Another question, if I may ask at the same time: How can I use -w option (when used) with or w/o a value? When no value is supplied, it should take the default value otherwise the supplied one.
Any help greatly appreciated. Cheers!!

It's been a while since I did anything with optparse, but I took a brief look through the docs and an old program.
"-f/-s,-e/-d are mandatory options but -f&-s cannot be used together and the same as with -e&-d options - cannot be used together. How can I put the check in place?"
For mutual exclusivity, you have to do the check yourself, for example:
parser.add_option("-e", help="e desc", dest="e_opt", action="store_true")
parser.add_option("-d", help="d desc", dest="d_opt", action="store_true")
(opts, args) = parser.parse_args()
if (parser.has_option("-e") and parser.has_option("-d")):
print "Error! Found both d and e options. You can't do that!"
sys.exit(1)
Since the example options here are boolean, you could replace the if line above with:
if (opts.e_opt and opts.d_opt):
See the section How optparse handles errors for more.
"How can I use -w option (when used) with or w/o a value?"
I've never figured out a way to have an optparse option for which a value is, well, optional. AFAIK, you have to set the option up to have values or to not have values. The closest I've come is to specify a default value for an option which must have a value. Then that entry doesn't have to be specified on the command line. Sample code :
parser.add_option("-w", help="warning", dest="warn", default=0)
An aside with a (hopefully helpful) suggestion:
If you saw the docs, you did see the part about how "mandatory options" is an oxymoron, right? ;-p Humor aside, you may want to consider re-designing the interface, so that:
Required information isn't entered using an "option".
Only one argument (or group of arguments) enters data which could be mutually exclusive. In other words, instead of "-e" or "-d", have "-e on" or "-e off". If you want something like "-v" for verbose and "-q" for quiet/verbose off, you can store the values into one variable:
parser.add_option("-v", help="verbose on", dest="verbose", action="store_true")
parser.add_option("-q", help="verbose off", dest="verbose", action="store_false")
This particular example is borrowed (with slight expansion) from the section Handling boolean (flag) options. For something like this you might also want to check out the Grouping Options section; I've not used this feature, so won't say more about it.

You should try with argparse if you are using 2.7+.
This section should be what you want.
Tl;dr:
parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group()
group.add_argument('--foo', action='store_true')
group.add_argument('--bar', action='store_false')
makes --foo and --bar mutually exclusive. See detailed argparse usage for more informations on using ArgumentParsers
Remember that optparse is deprecated, so using argparse is a good idea anyway.

Python, optparse and file mask

if __name__=='__main__':
parser = OptionParser()
parser.add_option("-i", "--input_file",
dest="input_filename",
help="Read input from FILE", metavar="FILE")
(options, args) = parser.parse_args()
print options
result is
$ python convert.py -i video_*
{'input_filename': 'video_1.wmv'}
there are video_[1-6].wmv in the current folder.
Question is why video_* become video_1.wmv. What i'm doing wrong?

Python has nothing to do with this -- it's the shell.
Call
$ python convert.py -i 'video_*'
and it will pass in that wildcard.
The other six values were passed in as args, not attached to the -i, exactly as if you'd run python convert.py -i video_1 video_2 video_3 video_4 video_5 video_6, and the -i only attaches to the immediate next parameter.
That said, your best bet might to be just read your input filenames from args, rather than using options.input.

Print out args and you'll see where the other files are going...
They are being converted to separate arguments in argv, and optparse only takes the first one as the value for the input_filename option.

To clarify:
aprogram -e *.wmv
on a Linux shell, all wildcards (*.wmv) are expanded by the shell. So aprogram actually recieves the arguments:
sys.argv == ['aprogram', '-e', '1.wmv', '2.wmv', '3.wmv']
Like Charles said, you can quote the argument to get it to pass in literally:
aprogram -e "*.wmv"
This will pass in:
sys.argv == ['aprogram', '-e', '*.wmv']

It isn't obvious, even if you read some of the standards (like this or this).
The args part of a command line are -- almost universally -- the input files.
There are only very rare odd-ball cases where an input file is specified as an option. It does happen, but it's very rare.
Also, the output files are never named as args. They almost always are provided as named options.
The idea is that
Most programs can (and should) read from stdin. The command-line argument of - is a code for "stdin". If no arguments are given, stdin is the fallback plan.
If your program opens any files, it may as well open an unlimited number of files specified on the command line. The shell facilitates this by expanding wild-cards for you. [Windows doesn't do this for you, however.]
You program should never overwrite a file without an explicit command-line options, like '-o somefile' to write to a file.
Note that cp, mv, rm are the big examples of programs that don't follow these standards.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

What is the canonical way of handling sys arguments in Python? - python

Related

Python - argparse - (Singleton) Argument w/ optional parameter [duplicate]

Prevent expansion of wildcards in non-quoted python script argument when running in UNIX environment

How to require one command line action argument among several possible but exclusive?

optparse(): Input validation

Python, optparse and file mask

Categories

Resources