Argument checks best practices - python

I have a simple code that needs at least 1 argument. Right now my code format looks something like this:
import argparse
import sys
# argparse stuff
parser = argparse.ArgumentParser()
parser.add_argument('-m')
parser.add_argument('-u')
args = parser.parse_args()
# check the number of arguments
if len(sys.argv) > 3:
    sys.exit()
if len(sys.argv) == 1:
    sys.exit()
class Program:
    def A(self): ...
    def B(self): ...
    def C(self): ...
if __name__ == '__main__':
    try:
        Program()
    except Exception:
        ...  # error handling elided in the original
The code works as intended, but I'd like to know how I can rewrite my code to be 'pythonic'. Do I put the argument checks under the 'if name' statement? If so, how? thanks.

I would suggest not looking at sys.argv, especially if you're already using a CLI parsing library.
Argparse has a pile of ways to enforce requirements, but if none of those fit your needs you can look at your 'args' object.
Personally, I would suggest not running functions like parse_args() in the global scope of the file. Instead, at minimum, wrap what you've got in a function called main, then call main() under if __name__ == '__main__':.
Argparse examples:
if '-m' and '-u' are mutually exclusive
parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument('-m')
group.add_argument('-u')
args = parser.parse_args() # exits with an error message unless exactly one of '-m' or '-u' is supplied
If a specific arg is required always
parser = argparse.ArgumentParser()
parser.add_argument('-m', required=True) # must always give '-m'
Or just looking at the 'args' object
parser = argparse.ArgumentParser()
parser.add_argument('-m')
parser.add_argument('-u')
args = parser.parse_args()
if not (args.m or args.u):
    sys.exit(1) # should exit non-zero on failures
main wrapping example:
import argparse
import sys
class Program:
    def A(self): ...
    def B(self): ...
    def C(self): ...
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-m')
    parser.add_argument('-u')
    args = parser.parse_args()
    if not (args.m or args.u):
        sys.exit(1)
    try:
        Program()
    except SomeException:
        # handle it
        pass # b/c I don't know what you need here
if __name__ == '__main__':
    main()

Checking the number of arguments after argparse doesn't make much sense. If there's some error, argparse will handle that, so you don't really have to replicate it.
Do put the arguments check after if __name__ check - just in case you want to import the module without executing.
Otherwise, it's just standard code as you'd see in argparse documentation. Nothing really wrong with it.
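As a minimal sketch of the advice in the answers above: assuming the script needs exactly one of -m or -u, the checks can live entirely in argparse and main() can take argv explicitly. The names build_parser and main and the returned summary string are illustrative assumptions, not part of the question.

```python
import argparse


def build_parser():
    """Build the parser; -m and -u are the options from the question."""
    parser = argparse.ArgumentParser()
    # required=True makes argparse reject a call that supplies neither option;
    # the group itself rejects a call that supplies both.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('-m')
    group.add_argument('-u')
    return parser


def main(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None) and return a summary."""
    args = build_parser().parse_args(argv)
    # Hypothetical stand-in for Program(); replace with the real work.
    return 'm={} u={}'.format(args.m, args.u)


if __name__ == '__main__':
    print(main())
```

With this layout no sys.argv length check is needed: a bad command line makes argparse print a usage message and exit with status 2.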

Related

how to test python file with pytest having if __name__ == '__main__': with arguments

I want to test a python file with pytest which contains a if __name__ == '__main__':
it also has arguments parsed in it.
the code is something like this:
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Execute job.')
    parser.add_argument('--env', required=True, choices=['qa', 'staging', 'prod'])
    args = parser.parse_args()
    # some logic
The limitation here is I cannot add a main() method and wrap the logic in if __name__ == '__main__': inside it and run it from there!
The code is a legacy code and cannot be changed!
I want to test this with pytest, I wonder how can I run this python file with some arguments inside my test?
Command line arguments are an input/output mechanism like any other. The key is to isolate it in a "boundary layer", and have your main program not depend on them directly.
In this case, rather than making your program access sys.argv directly (which is essentially a global variable), make it so your program is wrapped in an "entry point" function (e.g. "main") that takes args as an explicit parameter.
If you don't pass any arguments to ArgumentParser.parse_args, it defaults to reading sys.argv[1:] for you. Instead, just pass along your args param.
It might look something like:
def main(args):
    # some logic
    pass
if __name__ == '__main__':
    main(sys.argv[1:])
Your unit tests can call this entry point function and pass in any args they want:
def test_main_does_foo_when_bar():
    result = main(["bar"])
    assert "foo" == result
Generally I make a main function which always takes arguments and call it with command line arguments when needed.
def main(args):
    # parser : argparse.ArgumentParser
    args = parser.parse_args(args)
    pass
if __name__ == '__main__':
    main(sys.argv[1:])
When run from command line it will be called with command line arguments. It can be called at anytime by calling main directly.
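The answers above assume the file can be edited. If it really cannot, one possible approach (a sketch, not something the answers propose) is to execute the file as __main__ from the test with the standard library's runpy.run_path while sys.argv is patched. The file name legacy_job.py, its contents, and the result variable are hypothetical stand-ins for the legacy script.

```python
import runpy
import sys
import tempfile
import textwrap
from pathlib import Path
from unittest import mock

# Hypothetical stand-in for the legacy script, which is not edited by the test.
LEGACY_SOURCE = textwrap.dedent("""
    import argparse
    if __name__ == '__main__':
        parser = argparse.ArgumentParser(description='Execute job.')
        parser.add_argument('--env', required=True, choices=['qa', 'staging', 'prod'])
        args = parser.parse_args()
        result = 'running in ' + args.env
""")


def run_legacy_script(path, argv):
    """Execute the file at path as __main__ with sys.argv replaced."""
    with mock.patch.object(sys, 'argv', [str(path), *argv]):
        # run_path returns the globals of the executed script as a dict.
        return runpy.run_path(str(path), run_name='__main__')


def demo(argv):
    """Write the stand-in script to a temp dir, run it, return its result."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / 'legacy_job.py'
        script.write_text(LEGACY_SOURCE)
        return run_legacy_script(script, argv)['result']
```

In a pytest test, demo(['--env', 'qa']) can be asserted on directly, and an invalid value can be checked with pytest.raises(SystemExit), since argparse exits on bad input.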

How to test if __name__ == "__main__" with passing command line arguments?

Hi I want to test my executable module main.py.
In this module there is function main() that takes two arguments:
# main.py
def main(population_size: int, number_of_iterations: int):
    ...
At the bottom of this module there is logic that takes command line arguments and executes main function:
# main.py
if __name__ == "__main__":
    # create parser and handle arguments
    PARSER = argparse.ArgumentParser()
    PARSER.add_argument("--populationSize",
                        type=int,
                        default=-1,
                        help="Number of individuals in one iteration")
    PARSER.add_argument("--numberOfIterations",
                        type=int,
                        default=-1,
                        help="Number of iterations in one run")
    # parse the arguments
    ARGS = PARSER.parse_args()
    main(ARGS.populationSize, ARGS.numberOfIterations)
I want to test passing command line arguments. My test method that doesn't work:
# test_main.py
@staticmethod
@mock.patch("argparse.ArgumentParser.parse_args")
@mock.patch("main.main")
def test_passing_arguments(mock_main, mock_argparse):
    """Test passing arguments."""
    mock_argparse.return_value = argparse.Namespace(
        populationSize=4, numberOfIterations=3)
    imp.load_source("__main__", "main.py")
    mock_main.assert_called_with(4, 3)
The error that I get is that mock_main is not called. I don't know why. To my understanding I mocked the main function from the main module. Mocking the main function is necessary because it's time consuming, and all I want to test here is that the parameters are passed correctly.
I took the way of mocking the argparse module from this post.
Like all code you want to test, wrap it in a function.
def parse_my_args(argv=None):
    PARSER = argparse.ArgumentParser()
    PARSER.add_argument("--populationSize",
                        type=int,
                        default=-1,
                        help="Number of individuals in one iteration")
    PARSER.add_argument("--numberOfIterations",
                        type=int,
                        default=-1,
                        help="Number of iterations in one run")
    # parse the arguments
    return PARSER.parse_args(argv)
if __name__ == '__main__':
    args = parse_my_args()
    main(args.populationSize, args.numberOfIterations)
ArgumentParser.parse_args processes whatever list of strings you pass it. When you pass None, it uses sys.argv[1:] instead.
Now you can test parse_my_args simply by passing whatever list of arguments you want.
# test_main.py
@staticmethod
def test_passing_arguments():
    """Test passing arguments."""
    args = parse_my_args(["--populationSize", "4", "--numberOfIterations", "3"])
    assert args.populationSize == 4
    assert args.numberOfIterations == 3
If you further want to verify that the correct arguments are passed to main, wrap that in a function and use mock as you did above.
def entry_point(argv=None):
    args = parse_my_args(argv)
    main(args.populationSize, args.numberOfIterations)
if __name__ == '__main__':
    entry_point()
and
@staticmethod
@mock.patch("main.main")
def test_passing_arguments(mock_main):
    """Test passing arguments."""
    entry_point(["--populationSize", "4", "--numberOfIterations", "3"])
    mock_main.assert_called_with(4, 3)
I usually write my command-line code like this. First rename your existing main function to something else, like run() (or whatever):
def run(population_size: int, number_of_iterations: int):
    ...
Then write a main() function which implements the command-line interface and argument parsing. Have it accept argv as an optional argument, which is great for testing:
def main(argv=None):
    parser = argparse.ArgumentParser()
    ...
    args = parser.parse_args(argv)
    run(args.population_size, args.number_of_iterations)
Then in the module body just put:
if __name__ == '__main__':
    sys.exit(main())
Now you have a proper main() function that you can easily test without fussing about the context in which it was called or doing any sort of weird monkeypatching, e.g. like:
main(['--populationSize', '4', '--numberOfIterations', '3'])
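The argv=None pattern in the answers above relies on parse_args falling back to sys.argv[1:] when it receives None. A small sketch of both paths, reusing the option names from the question; the helper parse_from_fake_command_line and the fake argv values are assumptions made only for this illustration:

```python
import argparse
import sys
from unittest import mock


def parse_my_args(argv=None):
    """Parse argv, or sys.argv[1:] when argv is None."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--populationSize', type=int, default=-1)
    parser.add_argument('--numberOfIterations', type=int, default=-1)
    return parser.parse_args(argv)


def parse_from_fake_command_line():
    """Call parse_my_args() with no argument while sys.argv is patched."""
    fake_argv = ['main.py', '--populationSize', '7']
    with mock.patch.object(sys, 'argv', fake_argv):
        # argv is None here, so argparse reads the patched sys.argv[1:].
        return parse_my_args()
```

Passing an explicit list is usually the simpler choice in tests; patching sys.argv is only needed when the call site cannot pass arguments through.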

argparse.ArgumentParser ArgumentError when adding arguments in multiple modules

I am working on automated test framework (using pytest) to test multiple flavors of an application. The test framework should be able to parse common (to all flavors) command line args and args specific to a flavor.
Here is how the code looks like:
parent.py:
import argparse
ARGS = None
PARSER = argparse.ArgumentParser()
PARSER.add_argument('--arg1', default='arg1', type=str, help='test arg1')
PARSER.add_argument('--arg2', default='arg2', type=str, help='test arg2')
def get_args():
    global ARGS
    if not ARGS:
        ARGS = PARSER.parse_args()
    return ARGS
MainScript.py:
import pytest
from parent import PARSER
ARGS = None
PARSER.conflict_handler = "resolve"
PARSER.add_argument('--arg3', default='arg3', type=str)
def get_args():
    global ARGS
    if not ARGS:
        ARGS = PARSER.parse_args()
    return ARGS
get_args()
def main():
    pytest.main(['./Test_Cases.py', '-v'])
if __name__ == "__main__":
    main()
Test_Cases.py
from MainScript import get_args
ARGS = get_args()
def test_case_one():
    pass
Executing MainScript.py fails with following error:
E ArgumentError: argument --arg3: conflicting option string(s): --arg3
So the problem is that you have declared
PARSER.add_argument('--arg3', default='arg3', type=str)
in the global scope of MainScript.py. That line therefore runs again every time the module is imported, as Test_Cases.py does. Because the same PARSER object from parent.py is reused, --arg3 gets added to it twice, which is what raises the conflict error.
Easiest solution is to move PARSER.add_argument('--arg3', default='arg3', type=str) into your main() function as that will only get called once.
def main():
    PARSER.add_argument('--arg3', default='arg3', type=str)
    pytest.main(['./Test_Cases.py', '-v'])
But doing that causes another problem stemming from your multiple definition of get_args(). When you call get_args() before your main() it only has the two defined arguments from parent.py so it's missing arg3. If you move the call down into your main() or at least after your main() gets called it will work.
Personally I just removed both the definition and the call of get_args() from MainScript.py and it worked just fine.
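Another way to avoid registering the same option twice, sketched here under the assumption that the modules can be restructured, is to build parsers inside functions and share the common options through argparse's parents= parameter. The function names build_common_parser, build_flavor_parser and get_args are illustrative; only the option names come from the question.

```python
import argparse


def build_common_parser():
    """Options shared by every flavor (arg1 and arg2 from parent.py)."""
    # add_help=False so the child parser can define -h/--help itself.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument('--arg1', default='arg1', type=str, help='test arg1')
    parser.add_argument('--arg2', default='arg2', type=str, help='test arg2')
    return parser


def build_flavor_parser():
    """Flavor-specific parser that inherits the common options."""
    parser = argparse.ArgumentParser(parents=[build_common_parser()])
    parser.add_argument('--arg3', default='arg3', type=str)
    return parser


def get_args(argv=None):
    # A fresh parser is built on every call, so importing this module
    # more than once never adds an option to an existing parser.
    return build_flavor_parser().parse_args(argv)
```

Since nothing is added to a module-level parser at import time, re-executing the module under a second name no longer triggers the conflict.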

how to elegantly parse arguments in python before expensive imports?

I have a script, which parses a few arguments, and has some expensive imports, but those imports are only needed if the user gives valid input arguments, otherwise the program exits. Also, when the user says python script.py --help, there is no need for those expensive imports to be executed at all.
I can think of such a script:
import argparse
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--argument', type=str)
    args = parser.parse_args()
    return args
if __name__ == "__main__":
    args = parse_args()
import gensim # expensive import
import blahblahblah
def the_rest_of_the_code(args):
    pass
if __name__ == "__main__":
    the_rest_of_the_code(args)
This does the job, but it doesn't look elegant to me. Any better suggestions for the task?
EDIT: the import is really expensive:
$ time python -c "import gensim"
Using TensorFlow backend.
real 0m12.257s
user 0m10.756s
sys 0m0.348s
You can import conditionally, or in a try block, or just about anywhere in code.
So you could do something like this:
import cheaplib
if __name__ == "__main__":
    args = parse_args()
    if expensive_arg in args:
        import expensivelib
    do_stuff(args)
Or even more clearly, only import the lib in the function that will use it.
def expensive_function():
    import expensivelib
    ...
Not sure it's better than what you already have, but you can load it lazily:
def load_gensim():
    global gensim
    import gensim
If you only want to make sure the arguments make sense, you can have a wrapper main module that checks the arguments and then loads another module and call it.
main.py:
args = check_args()
if args is not None:
    import mymodule
    mymodule.main(args)
mymodule.py:
import gensim
def main(args):
    # do work
    ...
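The suggestions above can be combined into one small sketch: parse first, import afterwards inside main(). Here importlib.import_module stands in for the plain import statement so the module name can be swapped, and 'json' is only a cheap placeholder for gensim; the heavy_module_name parameter is an assumption made for this illustration.

```python
import argparse
import importlib


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('--argument', type=str)
    return parser.parse_args(argv)


def main(argv=None, heavy_module_name='json'):
    # --help and argument errors exit here, before the heavy import runs.
    args = parse_args(argv)
    # 'json' is a cheap stand-in; real code would import gensim (or similar).
    heavy = importlib.import_module(heavy_module_name)
    return heavy.__name__, args.argument


if __name__ == '__main__':
    print(main())
```

Running the script with --help therefore returns immediately, and the cost of the import is only paid once the arguments have been accepted.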

In Python, can I call the main() of an imported module?

In Python I have a module myModule.py where I define a few functions and a main(), which takes a few command line arguments.
I usually call this main() from a bash script. Now, I would like to put everything into a small package, so I thought that maybe I could turn my simple bash script into a Python script and put it in the package.
So, how do I actually call the main() function of myModule.py from the main() function of MyFormerBashScript.py? Can I even do that? How do I pass any arguments to it?
It's just a function. Import it and call it:
import myModule
myModule.main()
If you need to parse arguments, you have two options:
Parse them in main(), but pass in sys.argv as a parameter (all code below in the same module myModule):
def main(args):
    # parse arguments using optparse or argparse or what have you
    ...
if __name__ == '__main__':
    import sys
    main(sys.argv[1:])
Now you can import and call myModule.main(['arg1', 'arg2', 'arg3']) from another module.
Have main() accept parameters that are already parsed (again all code in the myModule module):
def main(foo, bar, baz='spam'):
    # run with already parsed arguments
    ...
if __name__ == '__main__':
    import sys
    # parse sys.argv[1:] using optparse or argparse or what have you
    main(foovalue, barvalue, **dictofoptions)
and import and call myModule.main(foovalue, barvalue, baz='ham') elsewhere, passing in Python arguments as needed.
The trick here is to detect when your module is being used as a script; when you run a python file as the main script (python filename.py) no import statement is being used, so python calls that module "__main__". But if that same filename.py code is treated as a module (import filename), then python uses that as the module name instead. In both cases the variable __name__ is set, and testing against that tells you how your code was run.
Martijn's answer makes sense, but it was missing something crucial that may seem obvious to others but was hard for me to figure out.
In the version where you use argparse, you need to have this line in the main body.
args = parser.parse_args(args)
Normally when you are using argparse just in a script you just write
args = parser.parse_args()
and parse_args finds the arguments from the command line. But in this case the main function does not read the command line arguments itself, so you have to tell argparse what the arguments are.
Here is an example
import argparse
import sys
def x(x_center, y_center):
    print("X center:", x_center)
    print("Y center:", y_center)
def main(args):
    parser = argparse.ArgumentParser(description="Do something.")
    parser.add_argument("-x", "--xcenter", type=float, default=2, required=False)
    parser.add_argument("-y", "--ycenter", type=float, default=4, required=False)
    args = parser.parse_args(args)
    x(args.xcenter, args.ycenter)
if __name__ == '__main__':
    main(sys.argv[1:])
Assuming you named this mytest.py,
you can run it in any of these ways from the command line
python ./mytest.py -x 8
python ./mytest.py -x 8 -y 2
python ./mytest.py
which print, respectively,
X center: 8.0
Y center: 4
or
X center: 8.0
Y center: 2.0
or
X center: 2
Y center: 4
Or if you want to run from another python script you can do
import mytest
mytest.main(["-x","7","-y","6"])
which returns
X center: 7.0
Y center: 6.0
It depends. If the main code is protected by an if as in:
if __name__ == '__main__':
    ...main code...
then no, not by a plain import: importing the module sets __name__ to the module's name rather than '__main__', so the guarded block is skipped.
But when all the code is in a function, then you might be able to. Try
import myModule
myModule.main()
This works even when the module protects itself with a __all__.
from myModule import * might not make main visible to you, so you really need to import the module itself.
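The point about __all__ can be checked with a small sketch. The module here is built in memory purely for illustration; the name my_module_demo, its contents, and both helper functions are assumptions, not part of the question.

```python
import sys
import types

# Hypothetical module source: __all__ deliberately leaves out main().
SOURCE = '''
__all__ = ['helper']

def helper():
    return 'helper'

def main():
    return 'main ran'
'''


def install_fake_module(name='my_module_demo'):
    """Create a module from SOURCE and register it so it can be imported."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    sys.modules[name] = module
    return module


def star_import_names(name='my_module_demo'):
    """Return the public names that a star-import of the module would bind."""
    namespace = {}
    exec('from {} import *'.format(name), namespace)
    # Drop __builtins__, which exec() adds to the namespace.
    return sorted(k for k in namespace if not k.startswith('__'))
```

After install_fake_module(), the star-import binds only helper, while importing the module itself and calling its main() still works, which matches the answer's advice to import the module rather than rely on import *.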
I had the same need using argparse too.
The thing is that the parse_args method of an argparse.ArgumentParser instance takes its arguments from sys.argv by default. The workaround, following Martijn's approach, is to make that explicit, so you can change the arguments you pass to parse_args as desired.
def main(args):
    # some stuff
    parser = argparse.ArgumentParser()
    # some other stuff
    parsed_args = parser.parse_args(args)
    # more stuff with the args
if __name__ == '__main__':
    import sys
    main(sys.argv[1:])
The key point is passing args to the parse_args function.
Later, to use main, you just do as Martijn describes.
The answer I was searching for was answered here: How to use python argparse with args other than sys.argv?
If main.py and parse_args() are written in this way, then the parsing can be done nicely
# main.py
import argparse
def parse_args():
    parser = argparse.ArgumentParser(description="")
    parser.add_argument('--input', default='my_input.txt')
    return parser
def main(args):
    print(args.input)
if __name__ == "__main__":
    parser = parse_args()
    args = parser.parse_args()
    main(args)
Then, in another Python script, you can build the arguments with parser.parse_args(['--input', 'foobar.txt']) and pass the result to main():
# temp.py
from main import main, parse_args
parser = parse_args()
args = parser.parse_args([]) # note the square bracket
# to overwrite default, use parser.parse_args(['--input', 'foobar.txt'])
print(args) # Namespace(input='my_input.txt')
main(args)
Assuming you are trying to pass the command line arguments as well.
import sys
import myModule
def main():
    # this will just pass all of the system arguments as is
    myModule.main(*sys.argv)
    # all the argv but the script name
    myModule.main(*sys.argv[1:])
I hit this problem and I couldn't call a file's Main() method because it was decorated with these click options, e.g.:
# @click.command()
# @click.option('--username', '-u', help="Username to use for authentication.")
When I removed these decorators I could call the Main() method successfully from another file.
from PyFileNameInSameDirectory import main as task
task()
